Know Your Data
1. The ArnetMiner citation dataset (provided by arnetminer.org), as of 2012, can be downloaded from
the attached file.
(1) Count the number of authors, venues (conferences/journals), and publications in the dataset.
(2) What are the min, max, Q1, Q3, and median number of publications per author? Can you plot
the histogram for number of publications per author?
(3) What are the min, max, Q1, Q3, and median number of citations per author? Can you plot the
histogram for number of citations received per author?
(4) Please draw a scatter plot of the number of publications vs. the number of citations for
authors who have more than 5 publications.
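A minimal sketch of how parts (1) to (4) might be approached is given below. It assumes the attached file uses the plain-text record layout often seen in ArnetMiner citation dumps (`#*` title, `#@` comma-separated authors, `#c` venue, `#index` paper id, one `#%` line per cited paper id). That layout is an assumption and is not stated in the assignment, so it should be checked against the actual file. The function names and the three-paper sample are hypothetical.

```python
from collections import Counter
from statistics import quantiles

def parse_records(lines):
    """Yield one dict per paper; a '#*' title line starts a new record (assumed layout)."""
    rec = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("#*"):
            if rec is not None:
                yield rec
            rec = {"authors": [], "venue": "", "index": None, "refs": []}
        elif rec is None:
            continue  # ignore anything before the first record
        elif line.startswith("#@"):
            rec["authors"] = [a.strip() for a in line[2:].split(",") if a.strip()]
        elif line.startswith("#c"):
            rec["venue"] = line[2:].strip()
        elif line.startswith("#index"):
            rec["index"] = line[6:].strip()
        elif line.startswith("#%") and line[2:].strip():
            rec["refs"].append(line[2:].strip())
    if rec is not None:
        yield rec

def author_stats(records):
    """Return (publications per author, citations per author, venue set, paper count)."""
    records = list(records)
    pubs, cited, venues = Counter(), Counter(), set()
    for r in records:
        pubs.update(set(r["authors"]))   # one count per author per paper
        cited.update(r["refs"])          # paper id -> times cited
        if r["venue"]:
            venues.add(r["venue"])
    cites = Counter({a: 0 for a in pubs})
    for r in records:
        for a in set(r["authors"]):
            cites[a] += cited[r["index"]]  # an author receives each paper's citations
    return pubs, cites, venues, len(records)

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Hypothetical three-paper sample in the assumed layout.
sample = """#*Paper A
#@Ann, Bob
#c VLDB
#index1

#*Paper B
#@Ann
#c KDD
#index2
#%1

#*Paper C
#@Cy
#c KDD
#index3
#%1
#%2
""".splitlines()

pubs, cites, venues, n_pubs = author_stats(parse_records(sample))
pub_summary = five_number_summary(list(pubs.values()))
cite_summary = five_number_summary(list(cites.values()))
# Points for part (4): authors with more than 5 publications.
prolific = [(pubs[a], cites[a]) for a in pubs if pubs[a] > 5]
print(len(pubs), len(venues), n_pubs)
```

On the real file, `parse_records` could be fed an open file object instead of `sample`. The values of `pubs` and `cites` can then be passed to a plotting library (for example matplotlib's `hist` and `scatter`) for the histograms and the scatter plot. Quartile conventions differ between tools, so the `method` argument is a choice rather than a requirement of the assignment.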
Classification for Matrix Data
2. Decision Tree
Construct a decision tree for the following training data, where “Edible” is the class we are going to
predict. Information gain is used to select the attributes. Please write down the major steps in the
construction process (you need to show the information gain for each candidate attribute when a new
node is created in the tree).
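The information-gain computation needed at each node can be sketched as follows. The training table is not reproduced in this text, so the four rows and the attribute names `Color` and `Size` below are hypothetical placeholders used only to exercise the functions. They are not the assignment's data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical rows, NOT the assignment's training table.
rows = [
    {"Color": "red",   "Size": "small", "Edible": "yes"},
    {"Color": "red",   "Size": "large", "Edible": "yes"},
    {"Color": "green", "Size": "small", "Edible": "no"},
    {"Color": "green", "Size": "large", "Edible": "no"},
]

gains = {a: information_gain(rows, a, "Edible") for a in ("Color", "Size")}
best = max(gains, key=gains.get)  # attribute chosen for the new node
print(gains, best)
```

In this toy case `Color` separates the classes perfectly while `Size` carries no information, so `Color` would be chosen at the root. The same two functions would be applied again to each child node's subset of rows.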
3. Naïve Bayes
Consider a Naïve Bayes model for spam classification with the vocabulary V = {secret, offer, low, price,
valued, customer, today, dollar, million, sports, is, for, play, healthy, pizza}, where each word in the
vocabulary is treated as a binary feature whose value is 1 if the word appears in a message and 0
otherwise. We have the messages and labels in the following table:

Messages                         Class label
Million dollar offer             Spam
Secret offer today               Spam
Secret is secret                 Spam
Low price for valued customer    Non-spam
Play secret sports today         Non-spam
Sports is healthy                Non-spam
Low price pizza                  Non-spam
Give the MLEs for the following parameters:
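The list of requested parameters is not included in this text. As a hedged sketch, the code below computes the maximum-likelihood estimates for every class prior and every per-word presence probability from the seven messages above, using plain relative frequencies with no smoothing. The function name `fit_mle` and the three queries printed at the end are illustrative choices, not the assignment's parameter list.

```python
from fractions import Fraction

VOCAB = ["secret", "offer", "low", "price", "valued", "customer", "today",
         "dollar", "million", "sports", "is", "for", "play", "healthy", "pizza"]

# Messages and labels copied from the table in the problem statement.
MESSAGES = [
    ("million dollar offer", "spam"),
    ("secret offer today", "spam"),
    ("secret is secret", "spam"),
    ("low price for valued customer", "non-spam"),
    ("play secret sports today", "non-spam"),
    ("sports is healthy", "non-spam"),
    ("low price pizza", "non-spam"),
]

def fit_mle(messages, vocab):
    """Return (prior, cond): prior[c] = P(class = c), cond[c][w] = P(w present | c)."""
    by_class = {}
    for text, label in messages:
        # Binary features: a word counts once per message, however often it repeats.
        by_class.setdefault(label, []).append(set(text.lower().split()))
    total = len(messages)
    prior = {c: Fraction(len(docs), total) for c, docs in by_class.items()}
    cond = {c: {w: Fraction(sum(w in d for d in docs), len(docs)) for w in vocab}
            for c, docs in by_class.items()}
    return prior, cond

prior, cond = fit_mle(MESSAGES, VOCAB)
print(prior["spam"])               # 3/7
print(cond["spam"]["secret"])      # 2/3
print(cond["non-spam"]["secret"])  # 1/4
```

Because the features are binary, "Secret is secret" contributes a single occurrence of `secret`. Exact fractions are used so the estimates can be compared directly with a hand calculation.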
