Fill in Order Details

  • Submit paper details for free using our simple order form

Make Payment Securely

  • Add funds to your account. There are no upfront payments. The writer will only be paid once you have approved your paper

Writing Process

  • The best qualified expert writer is assigned to work on your order
  • Your paper is written to standard and delivered as per your instructions

Download your paper

  • Download the completed paper from your online account or your email
  • You can request a plagiarism and quality report along with your paper

data science and longest common subsequence python

I have attached the .txt file, the function to import the .txt file, and the recursive longest common subsequence code.

Q2) (20 points) Data Science is one of the most popular areas where Python is widely used. In this question, you will have an opportunity to put your tiny tiny first baby step into this territory. You will also develop your own simple hash table using the existing standard Python data types.

  1. Modify the LCS recursive version to count the number of recursive calls for each 6-digit integer string against “0123456789” string. The function returns a tuple of two elements (LCS num, the number of recursive calls to find the LCS)
  2. I think I found a pretty good (but very slow) hash function of integer strings, which is the recursive LCS function. 🙂 Use this function as your hash function to store the integer strings you read in Q1) into your own hash table. Please do not use Python dict() data type directly. You should develop your own hash table where the keys are the number of recursive calls as computed from the recursive LCS function.
  3. Now, regarding how good the new hash function is in terms of generating keys uniformly, we want to check it by counting the number of collisions for each and every key in the hash table from 1M integers. It would be very ideal if the average collision number is close to 100 for 10,000 buckets out of 1M numbers. We can get an idea of it by computing the average number of collisions, but we may also visualize the distribution of the collisions across all the keys using Python plot library.
  4. Use a plotting library (https://plot.ly/python/ (Links to an external site.)) to visualize the distribution of key collision of the LCS hash function:

import matplotlib.pyplot as plt
from plotly.graph_objs import Bar, Layout
from plotly import offline

# or any other library of your choice such as panda, if you prefer

    • Draw an offline bar graph to show the relationship between hash keys (the number of recursive calls to compute the LCS against “0123456789” ) and the number of collisions for each hash key. You may refer to the chart that I came up with using python dict() class (I could have used collections. Counter class for that matter) but again you should use your own hash table for this homework which maps each key to the corresponding list of collisions. In any case, the final chart should look very similar, if not identical. << lcsUniformity.html >>

5. From my personal experience, I firmly believe that no meaningful data analysis can be completed without interjecting human intuition in the data interpretation process. Let’s exercise our own. Look at my chart mentioned above. It’s not perfect but it appears fairly uniformly distributed across the keys (LCS numbers). But it you look at it closely, after passing certain threshold, the collision rate tapering off, which means that bigger the number of recursion gets, smaller the collisions (frequencies of hash keys generated in the range) gets. They are thinly spread out. Also note that these numbers would take much longer time to get a hash key using the LCS hash function because it makes many more recursive calls to finish the recursion. So I would like to limit the number of recursive calls when it passes a certain threshold. With that in mind,

  • using your intuition, from your own bar graph, pick up a threshold number and give your reasons explaining why you picked up the number. No right or wrong answer here.
  • To handle those numbers that go over the threshold, you may develop and run a simple secondary hash function of your own choice to collapse those outliers to a shorter range so that the upper bound of entire hash table of yours is below or equal to 10,000.
  • Then run your new algorithm to redraw the distribution across new set of keys added to the existing keys.

Since running LCS for the million numbers in rand1000000.txt file takes quite a long time, you may first test it with rand1000,txt file ( less than 10 seconds)

Q2 Deliverable :

  • Python codes of modified LCS recursive and the driver program
  • The offline html files of the hash table collision charts for both before and after modification

WHAT OUR CURRENT CUSTOMERS SAY

  • Google
  • Sitejabber
  • Trustpilot
Zahraa S
Zahraa S
Absolutely spot on. I have had the best experience with Elite Academic Research and all my work have scored highly. Thank you for your professionalism and using expert writers with vast and outstanding knowledge in their fields. I highly recommend any day and time.
Stuart L
Stuart L
Thanks for keeping me sane for getting everything out of the way, I’ve been stuck working more than full time and balancing the rest but I’m glad you’ve been ensuring my school work is taken care of. I'll recommend Elite Academic Research to anyone who seeks quality academic help, thank you so much!
Mindi D
Mindi D
Brilliant writers and awesome support team. You can tell by the depth of research and the quality of work delivered that the writers care deeply about delivering that perfect grade.
Samuel Y
Samuel Y
I really appreciate the work all your amazing writers do to ensure that my papers are always delivered on time and always of the highest quality. I was at a crossroads last semester and I almost dropped out of school because of the many issues that were bombarding but I am glad a friend referred me to you guys. You came up big for me and continue to do so. I just wish I knew about your services earlier.
Cindy L
Cindy L
You can't fault the paper quality and speed of delivery. I have been using these guys for the past 3 years and I not even once have they ever failed me. They deliver properly researched papers way ahead of time. Each time I think I have had the best their professional writers surprise me with even better quality work. Elite Academic Research is a true Gem among essay writing companies.
Got an A and plagiarism percent was less than 10%! Thanks!

ORDER NOW


Consider Your Assignments Done

“All my friends and I are getting help from eliteacademicresearch. It’s every college student’s best kept secret!”

Jermaine Byrant
BSN

“I was apprehensive at first. But I must say it was a great experience and well worth the price. I got an A!”

Nicole Johnson
Finance & Economics

Our Top Experts

See Why Our Clients Hire Us Again And Again!


OVER

10.3k
Reviews

RATING
4.89/5
Average

YEARS
13
Mastery

Success Guarantee

When you order form the best, some of your greatest problems as a student are solved!

Reliable

Professional

Affordable

Quick

Using this writing service is legal and is not prohibited by any law, university or college policies. Services of Elite Academic Research are provided for research and study purposes only with the intent to help students improve their writing and academic experience. We do not condone or encourage cheating, academic dishonesty, or any form of plagiarism. Our original, plagiarism-free, zero-AI expert samples should only be used as references. It is your responsibility to cite any outside sources appropriately. This service will be useful for students looking for quick, reliable, and efficient online class-help on a variety of topics.