Big data project

Closed. Posted 6 years ago. Payment on delivery.

Find Friends:

Objectives:

1- Give students the chance to practice a programming language that will be needed in the course.

2- Handle and understand semi-structured data.

3- Extract the required information when SQL queries or database techniques cannot be used.

4- Find patterns in data.

5- Find significant entities that have special characteristics.

6- Give students the chance to perform some data analytics steps.

1- You will be given a file named hobbies.txt.

a. This file contains a group of fictitious Facebook users and their hobbies.

b. Each line in the file contains a user/username and a list of hobbies of that user.

c. The data in each line is delimited by commas.

d. For instance, in the line: 2254,reading,coding,swimming,playing soccer,

i. The user/username is: 2254

ii. The hobbies are: reading, coding, swimming, and playing soccer

iii. The number and type of hobbies may differ from one user to another.
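The line format described above can be read with a few lines of Python. The following is a minimal sketch, not a required design: the function names `parse_line` and `load_users` are hypothetical, and it assumes that empty fields (such as the one produced by the trailing comma in the sample line) should be ignored and that a repeated username simply overwrites the earlier entry.

```python
def parse_line(line):
    """Split one line of hobbies.txt into (username, set_of_hobbies)."""
    # Split on the comma delimiter and trim whitespace around each field.
    fields = [field.strip() for field in line.strip().split(",")]
    # The sample line ends with a trailing comma, which yields an empty
    # field; drop empty fields so they are not counted as hobbies.
    fields = [field for field in fields if field]
    if not fields:
        return None  # blank line, nothing to parse
    username = fields[0]
    hobbies = set(fields[1:])  # the number of hobbies varies per user
    return username, hobbies


def load_users(path="hobbies.txt"):
    """Read the file into a dict mapping username -> set of hobbies."""
    users = {}
    with open(path) as handle:
        for line in handle:
            parsed = parse_line(line)
            if parsed is not None:
                username, hobbies = parsed
                # Assumption: a repeated username replaces the earlier entry.
                users[username] = hobbies
    return users
```

Storing hobbies as a set makes it cheap to intersect two users' hobbies later, which is the core operation when looking for shared hobbies.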

2- This file is the data set that your code has to read. Your program should do the following:

a. Finding circles/networks of friends:

i. In each circle you report, all the users should share at least x hobbies.

ii. x is a variable that a user can input to the program.

iii. Circles of friends should be written to a file named circles.txt.

iv. Each line should contain the usernames in the circle/network you found, a tab character, and the list of shared hobbies.

v. For example, a line may look like: 2254,552,1258 reading,swimming,hiking (with a tab character between the usernames and the hobbies)

b. Finding popular users:

i. Popularity is based on being part of at least y circles/networks.

ii. y is a variable that a user can input to the program.

iii. Popular users should be written to a file named popular.txt. Each user and the number of circles/networks the user belongs to should be on a separate line, separated by a tab character.

iv. For instance: 2254 5

v. This step should occur after step (a.).

vi. Hint: You may want to save the circles you found in part (a.) in some data structure so that you can use them in this part.
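Steps (a.) and (b.) can be sketched as follows. This is only one possible reading of "circle", stated here as an assumption: every pair of users is compared, and users are grouped under the exact set of hobbies that a pair shares, provided that set has at least x hobbies. Every member of a group then has all of that group's hobbies. The function names are hypothetical, and the pairwise comparison is quadratic in the number of users, so it is a starting point rather than a tuned solution.

```python
from itertools import combinations


def find_circles(users, x):
    """Return a dict mapping frozenset(shared hobbies) -> set of usernames.

    users maps username -> set of hobbies. A pair of users contributes to
    a circle only if they share at least x hobbies.
    """
    circles = {}
    for user_a, user_b in combinations(sorted(users), 2):
        shared = frozenset(users[user_a] & users[user_b])
        if len(shared) >= x:
            # Both users have every hobby in `shared`, so they belong
            # to the circle identified by that hobby set.
            circles.setdefault(shared, set()).update([user_a, user_b])
    return circles


def format_circle(members, shared):
    """Build one circles.txt line: usernames, a tab, then shared hobbies."""
    # Sorting (as strings) only makes the output deterministic.
    return ",".join(sorted(members)) + "\t" + ",".join(sorted(shared))


def popular_users(circles, y):
    """Return a dict of username -> circle count for users in >= y circles."""
    counts = {}
    for members in circles.values():
        for user in members:
            counts[user] = counts.get(user, 0) + 1
    return dict((user, n) for user, n in counts.items() if n >= y)


def write_results(circles, popular):
    """Write circles.txt and popular.txt in the current directory."""
    with open("circles.txt", "w") as out:
        for shared, members in circles.items():
            out.write(format_circle(members, shared) + "\n")
    with open("popular.txt", "w") as out:
        for user, n in sorted(popular.items()):
            out.write(user + "\t" + str(n) + "\n")
```

Keeping the circles in a dictionary follows the hint in (vi.): the popularity count in step (b.) reuses the result of step (a.) instead of recomputing it.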

Notes:

2- You should develop this project on the Linux machine (the Cloudera virtual machine) that you installed at the beginning of this semester, without installing any special packages or libraries beyond the default compilers and libraries.

3- Name the solution file [login to view URL], [login to view URL], or facebook.scala.

4- Only one code file should be submitted per group. Your code should start with a comment block.

5- This comment block has:

a. Student names, IDs, and sections

6- You have to make sure that your code runs error-free; in particular, it must have no compilation errors.

a. We will not debug or fix any errors. A very low score should be expected in this case.

7- Be careful with path names.

a. Always assume current folder/directory.

8- The command to run your code would be similar to: python2.6 [login to view URL] 5 6

a. 5 refers to x in step (a.), and 6 refers to y in step (b.).
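Reading the two thresholds from the command line might look like the sketch below. The function name `parse_thresholds` is hypothetical, and rejecting values below 1 is an assumption of this sketch, since the brief does not say how such input should be handled.

```python
def parse_thresholds(argv):
    """Return (x, y) from an argument list shaped like ['script', '5', '6']."""
    if len(argv) != 3:
        raise ValueError("expected exactly two arguments: x and y")
    x = int(argv[1])  # minimum number of shared hobbies per circle
    y = int(argv[2])  # minimum number of circles for a popular user
    # Assumption: thresholds below 1 are treated as invalid input.
    if x < 1 or y < 1:
        raise ValueError("x and y must be positive integers")
    return x, y


# Typical use at the top of the script:
#   import sys
#   x, y = parse_thresholds(sys.argv)
```

Taking the argument list as a parameter, rather than reading `sys.argv` inside the function, keeps the parsing easy to check without launching the script.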

Big Data Sales, Cloud Computing, Data Science, Machine Learning (ML)

Project ID: #15211569

About the project

5 proposals. Remote project. Active 6 years ago.

5 freelancers are bidding an average of $127 for this job

shahzaib121

Hello. We have a team of experts in Big Data Analytics. We have done many projects in the Retail and Financial domains, using different Big Data tools. Relevant Skills and Experience: Our Team is Plus

(18 reviews)
4.5