Conduct research based on the information below, using R. After analyzing the data in R, document the research and findings in a research paper in APA 7 format. Ask questions, if needed.
Topic: Stack Overflow hosts an annual survey for developers. The study for 2019 includes almost 90,000 respondents (Stack Overflow, n.d.a).
Problem: Surveys usually instruct participants to answer to the best of their ability. This expectation of honest answers implies an expectation of consistent responses. Inconsistency can arise in a variety of ways. One example is that respondents may interpret the same question differently. Another is a multiple-choice question where more than one of the choices, or none of them, fits the respondent. In the study by Stack Overflow (n.d.b), respondents answered the employment question and employment-related questions inconsistently. Modeling the survey results can offer new insight into these inconsistencies.
Question: Using a neural network and a random forest model on the Stack Overflow (n.d.b) data, do the survey responses on employment, developer status, and coding as a hobbyist, along with the answers to an open-source sharing question, provide sufficient information to predict how the participant responded to the question about their student status?
Data:
• The data and data dictionaries are online.
  o Note: The raw data in your program must be in the original form. Do not modify the data outside of the programming. Use the data dictionary to understand the data.
  o You can read Stack Overflow’s (n.d.a) report on the survey.
    ▪ Stack Overflow. (n.d.a). Developer survey results: 2019. Retrieved May 24, 2020, from [login to view URL]
  o The data and data dictionary are downloaded together. When you visit this site, ensure you select the 2019 survey:
    ▪ Stack Overflow. (n.d.b). Stack Overflow annual developer survey [dataset and code book]. Retrieved May 24, 2020, from [login to view URL]
Requirements for this data analysis project:
• Develop at least one additional well-developed research question.
• When conducting data analysis, limit your research to the Netherlands.
• Develop two classification algorithms: a neural network and a random forest classifier. Attempt to create a classification model with an accuracy that exceeds both 0.8 and the no-information rate when predicting the testing dataset. Tune the model(s) if they do not meet the sensitivity threshold. Compare the two models’ accuracy.
• Do not forget to address the problem.
• Explore the insights you can gain from this model and provide your interpretations when documenting your research.
5/20/20 Assignment 2 [login to view URL]
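The data-preparation requirements (raw data untouched, Netherlands only, a held-out testing dataset) could be sketched in base R roughly as follows. This is a minimal illustration, not the expected solution. The toy data frame stands in for the real file, and its answer values are invented placeholders. The column names Country, Student, Employment, and Hobbyist are assumed to match the data dictionary; MainBranch comes from the tips. OpenSourcer is used only as a placeholder for the open-source variable, so check the data dictionary for which of the two applies. The file name in the comment and the 70/30 split are also assumptions.

```r
# Toy stand-in for the 2019 survey. In a real script the raw CSV would be read
# unmodified, e.g. survey <- read.csv("survey_results_public.csv")
# (the file name is an assumption).
set.seed(42)
n <- 200
survey <- data.frame(
  Country     = sample(c("Netherlands", "Germany"), n, replace = TRUE),
  Student     = sample(c("No", "Yes, full-time", "Yes, part-time", NA), n,
                       replace = TRUE, prob = c(0.70, 0.15, 0.10, 0.05)),
  Employment  = sample(c("Employed full-time", "Not employed"), n, replace = TRUE),
  MainBranch  = sample(c("Developer", "Student"), n, replace = TRUE),
  Hobbyist    = sample(c("Yes", "No"), n, replace = TRUE),
  OpenSourcer = sample(c("Never", "Often"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Restrict to the Netherlands and keep only the modeling columns.
vars <- c("Student", "Employment", "MainBranch", "Hobbyist", "OpenSourcer")
nl <- survey[survey$Country == "Netherlands", vars]

# Drop rows with a missing value and convert every column to a factor.
nl <- nl[complete.cases(nl), ]
nl[] <- lapply(nl, factor)

# 70/30 train/test split (the proportion is an arbitrary choice).
train_idx <- sample(nrow(nl), size = floor(0.7 * nrow(nl)))
train <- nl[train_idx, ]
test  <- nl[-train_idx, ]

# Class balance of the target; the largest share is the no-information rate.
print(prop.table(table(train$Student)))
```

All filtering and cleaning happens inside the script, so the downloaded file itself stays in its original form.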
Required files to submit:
1) Research paper in APA 7 format; MS Word document file type
2) R Script; final version
Bonus challenge: Beyond the accuracy metric, explore the influence of the high no-information rate in this analysis. The idea is for you to discover how accuracy can be misleading, or how a high overall accuracy score can mask poor accuracy on individual labels when the labels are unevenly distributed. This challenge is specific to this data. Do not provide generic descriptions of the metrics; I am not interested in generic descriptions.
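As a small worked illustration of why accuracy can mislead under a high no-information rate, consider the base R sketch below. The labels and counts are invented for the example and are not survey results: 85 of 100 hypothetical test cases belong to the class "No", and the hypothetical model always predicts "No".

```r
# Hypothetical test-set labels (NOT survey results): 85 "No", 15 "Yes".
actual <- factor(c(rep("No", 85), rep("Yes", 15)), levels = c("No", "Yes"))
# A model that always predicts the majority class "No".
predicted <- factor(rep("No", 100), levels = c("No", "Yes"))

# Confusion matrix with predictions in rows and actual labels in columns.
cm <- table(Predicted = predicted, Actual = actual)

# Overall accuracy: correct predictions / all predictions.
accuracy <- sum(diag(cm)) / sum(cm)
# No-information rate: share of the most frequent actual class.
nir <- max(colSums(cm)) / sum(cm)
# Per-class recall: correct predictions for a class / actual cases of that class.
recall <- diag(cm) / colSums(cm)

print(cm)
cat("accuracy:", accuracy, " no-information rate:", nir, "\n")
print(recall)
```

Here the accuracy is 0.85, which clears the 0.8 target, yet it only equals the no-information rate of 0.85, and the recall for "Yes" is 0: the model never identifies a single minority-class respondent. Comparing accuracy with the no-information rate and checking per-class recall exposes what the single accuracy figure hides.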
Tips:
• MainBranch is the variable name for developer status.
• There is a difference between OpenSourcer and OpenSource; make sure you understand which variable applies.
o The research paper will be written in a professional writing style, following APA 7 student paper format; you can use the student paper template.
o The document shall be 3-5 pages or at least 800 words.
o Ensure that every reference in your reference list is also cited in the text. Do not forget to cite and reference the source of the data.
7 freelancers are bidding an average of $64 for this job
1. I am an expert in writing research reports and in R programming. I read your project description and I am sure that I can handle your project. 2. I am also an expert in academic writing, research re…
Hi, I am Ibrahim, and I am a data scientist with great expertise in statistics, R, and Python. I have experience in predictive algorithms and statistical software like SPSS. Please provide more details on the…
Hello there, I'm a data scientist and mathematician. I am well versed in algebra and analysis. I work on: linear algebra, statistics, statistical modeling using R, Markdown report, machine learning usin…
Hello, I am a machine learning engineer with 2+ years of experience in application of statistical models on structured and unstructured data to derive predictive business insight, Technology Stack: R
Hello, I hope this message finds you well. I checked your details and I believe that my experience is what you are looking for. I have been working on similar projects for the past eight years, and I have the essential sk…
I am proficient in R and can do the above-mentioned work efficiently, as I have experience coding in R while creating machine learning models.
I am an expert in courses like Statistics, Mathematics, R Programming Language, Statistical Analysis, SPSS Statistics, Microsoft Office, Data Entry.