Cornell Tech-funded startup launching bootcamp for data scientists

The Data Incubator program is designed to make science and engineering PhDs better data scientists and quants

Advanced academic backgrounds in statistics, mathematics, and other science and technology fields usually provide the raw analytical skills required for a data scientist's job.

But even with such skills, some additional prep work is generally needed to handle such a job in private industry. The Data Incubator, a New York-based startup with funding from Cornell Tech, aims to do just that by offering a six-week bootcamp with programs designed to prepare science and engineering PhDs for careers as data scientists and quants.

The program is the brainchild of Michael Li, a former data scientist at Foursquare who holds a PhD in computational and applied mathematics from Princeton University and who used his experience transitioning from academia to private industry to design the program. The bootcamp will focus on helping academics sharpen their programming, communication and business skills.

Many large companies are awash in data and are desperately seeking people with the skills to help them extract business value from it. As a result, solid academic backgrounds in computational and applied mathematics, statistics and other STEM areas are a hot commodity these days.

Li says job applicants with the unique combination of analytical, programming and communication skills needed to extract business value from massive, often chaotic data sets are hard to come by. People with deep programming skills often lack the analytic acumen for the job, while those with the analytical chops generally don't have the coding skills or industry knowledge, he added.

The six-week Data Incubator program will mentor academics in both the technical and non-technical skills needed to become top data scientists.

On the technical side, the program offers mentoring in areas like natural language processing, hypothesis testing, predictive modeling and data visualization, as well as classes on using programming tools like Python and NumPy, and on database and parallelization technologies like Hadoop and MapReduce.

Program fellows will be guided through a portfolio project to demonstrate their skills and techniques as data scientists.

The bootcamp is free for those selected to attend. Companies that hire graduates of the program will be required to pay the candidate's training costs, Li said.

There is a huge burden in terms of time and resources that companies have to put into finding data scientists, Li said. "You have to spend a lot of resources to figure out if someone who is good on paper is really good."

The Data Incubator program can identify which difficult-to-learn math and statistical skills students already have, and thus knows exactly what they still need to meet the criteria for a data scientist.

Getting into the program will be harder than getting admitted to Harvard University, promises Li. Fewer than 5% of the 1,000 individuals who have already applied to participate in the inaugural bootcamp have been accepted. Li plans to conduct a minimum of four bootcamps annually.

"All our graduating fellows are looking to take positions as data scientists and quants in industry," he said.

Several companies, including Foursquare, Mashable, Truveris, online marketplace Etsy and Flatiron Health, which specializes in oncology-related data analytics, have already agreed to hire graduating fellows.

Truveris, a provider of analytics and reporting services to the pharmacy benefits industry, handles data pertaining to roughly one in 10 of all prescriptions filled in the U.S. The company helps insurance firms and others in the healthcare industry sift through mountains of prescription data to help drive better efficiencies and cut costs.

Finding people to do the complex analytics can be challenging, said AJ Loiacono, co-founder and executive vice president of Truveris. So far, the company has had a fair degree of luck finding people through universities and word-of-mouth, he said.

Loiacono said he's impressed that the program focuses on the analytical and technical side as well as the applied programming and pragmatic skills. "We could try and get someone up to speed any time in house. But it is not a recipe for success," he said.

San Francisco-based Zipfian Academy, meanwhile, offers a crash course for people with quantitative backgrounds who are interested in a career in data science.

Unlike Li's bootcamp, Zipfian's immersive data science training program costs a cool $16,000 for 12 weeks. But it offers a $4,000 reimbursement to graduates who accept a position through its hiring program.

Zipfian also offers scholarships and payment plans to help program participants pay for the course. Among the companies hiring from Zipfian are Facebook, Khan Academy and Chartio.

Jaikumar Vijayan covers data security and privacy issues, financial services security and e-voting for Computerworld. Follow Jaikumar on Twitter at @jaivijayan or subscribe to Jaikumar's RSS feed.
