Research Intern

DatologyAI


Date: 2 weeks ago
City: Redwood City, CA
Contract type: Intern
About The Company

Companies want to train their own large models on their own data. The current industry standard is to train on a random sample of your data, which is inefficient at best and actively harmful to model quality at worst. There is compelling research showing that smarter data selection can train better models faster; we know because we did much of this research. Given the high costs of training, this presents a huge market opportunity. We founded DatologyAI to translate this research into tools that enable enterprise customers to identify the right data on which to train, resulting in better models for cheaper. Our team has pioneered deep learning data research, built startups, and created tools for enterprise ML.

Following our $11.65M Seed round last September, we've raised a $46M Series A led by Felicis Ventures. Our investors include Radical Ventures, Amplify Partners, Microsoft, Amazon, and notable angels like Jeff Dean, Geoff Hinton, Yann LeCun and Elad Gil. With over $57.5M in total funding, we're rapidly scaling our team and computing resources to revolutionize data curation across modalities.

Join us in pushing the boundaries of what's possible in AI!

About The Role

This role is closed for summer 2024 since we have reached capacity, but we will still accept intern candidates for fall and winter!

As a Research Intern at DatologyAI, you will conduct research investigating how intervention on training data can improve the quality and shape the behavior of deep learning models. Here is what your day-to-day would look like:

  • Transform messy literature into practical improvements. The research literature is vast, ambiguous, and constantly evolving. You will use your skills as a scientist to source, vet, implement, and improve promising ideas, both from the literature and of your own creation.
  • Perform high-risk, high-reward research. We want our interns to focus on problems that have massive potential to transform how data is ingested into future ML models. Rather than making incremental changes to current algorithms, we want you to work on novel project ideas that could change how we view data.
  • Conduct science driven by real-world needs. At DatologyAI, we understand that conference reviewers and academic benchmarks don't always incentivize the most impactful research. Concrete customer needs and product improvements will guide your research.
  • Science is more than just experiments. We expect our Research Scientist Interns to collaborate closely with engineers, talk to customers, and shape the product vision.

About You

Ideal candidates should have strong coding skills and fit at least one of the following:

  • We would like to hire students with practical experience and/or publications related to any of the following research topics:
      • Data research
      • Data pruning/curation
      • Curriculum learning
      • Synthetic data generation
      • Dataset distillation
      • Effects of training data on model behavior
      • Embedding models
      • Semantic search
      • Efficient ML
  • We would love to have you if you have practical experience and/or publications related to training large vision (especially video), language, and multimodal models.
  • Or teach us something new that you are passionate about that could improve data curation!

