I have developed an interest in applied statistics, especially data science (if you accept that the concept of data science is valid, which not everyone does). Right now I'm informally studying math, including statistics, as well as programming and general data science techniques. It will probably take me about two more years before I have the skills needed to do actual research, and even then I'll be doing it solo for fun, but in the meantime I'd like to practise some of the skills involved.
One practice involved in some kinds of data science is called "scraping," which means using a program to download webpages (for example, as HTML files on your hard drive) so that data can be pulled out of them. It's usually automated with a script or ready-made software, both of which are fairly easy to find. Scraping is done all the time but, outside of for-profit businesses researching their competitors, it's considered ethical to ask a website owner's permission before scraping the site. What I'd like to do is ask permission of everyone involved in A Lonely Life Forums, including all users who choose to respond, to scrape the public, non-identifying, non-sensitive and non-confidential parts of the forum, just to practise data gathering and data cleaning (which here means assembling the data into tables). If anyone at all is not comfortable with this, or if the website's owner says no, I shan't do it. Let me know.
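For anyone curious what this would look like in practice, here is a rough Python sketch of the two steps, using only the standard library. It makes no network requests: the robots.txt text, the sample HTML, the `post`/`author`/`body` class names and the `practice-bot` user agent are all made up for illustration and are not the forum's real markup or rules. Checking robots.txt is my own addition as an extra courtesy on top of asking permission, not a replacement for it.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not the forum's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical page markup (an assumption, not the forum's real HTML).
SAMPLE_HTML = """
<html><body>
<div class="post"><span class="author">user_a</span><p class="body">Hello  there.</p></div>
<div class="post"><span class="author">user_b</span><p class="body">Second
 post.</p></div>
</body></html>
"""


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class PostParser(HTMLParser):
    """Collect one {'author', 'body'} row per <div class="post">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None  # row being built, or None outside a post
        self._field = None    # which field the current text belongs to

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls == "post":
            self._current = {"author": "", "body": ""}
        elif cls in ("author", "body") and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag in ("span", "p"):
            self._field = None
        elif tag == "div" and self._current is not None:
            # Cleaning step: collapse runs of whitespace into single spaces.
            cleaned = {k: " ".join(v.split()) for k, v in self._current.items()}
            self.rows.append(cleaned)
            self._current = None


def parse_posts(html):
    """Gathering step: turn raw HTML into a list of row dictionaries."""
    parser = PostParser()
    parser.feed(html)
    return parser.rows


def rows_to_csv(rows):
    """Assemble the rows into a CSV table and return it as a string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["author", "body"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(parse_posts(SAMPLE_HTML))` would give a small two-column table. A real version would also need to fetch pages slowly and leave out usernames or anything else identifying, which this sketch doesn't attempt.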