About the Job
The Lifetime Value Co. is looking for a Web Scraping Engineer to join our growing team as a long-term contractor and help us process the new data we acquire.
As a Web Scraping-focused Engineer, you will be responsible for extracting and ingesting data from websites using web crawling tools. In this role, you will own the creation of these tools, services, and workflows to improve crawl/scrape analysis, reports, and data management. We will rely on you to test the data and the scrapes to ensure accuracy and quality. You will also own the process of identifying and fixing scraper breakages and of scaling scrapes as needed.
At LTV, we all work closely together across teams so there’s no red tape or bureaucracy. We get things done!
What You Will Get to Do
- Extract and ingest data from websites using web crawling tools, and build and maintain the infrastructure that supports those tools
- Own the creation of these tools, services, and workflows to improve crawl/scrape analysis, reports, and data management
- Test the data and the scrape to ensure accuracy and quality
- Own the process of identifying and fixing scraper breakages, and scale scrapers as needed
- Build and manage targeted web scrapers, including but not limited to ad hoc scraping tasks and regularly recurring production scraping jobs
- Manage the pipeline and storage for the data those scrapers collect
- Work closely with Data Scientists and the Product Team to build and develop future data pipelines as defined by the business
What You Bring to the Table
- Experience running large-scale web scrapers; ideally some familiarity with a big data stack
- Experience with system monitoring/administration tools
- Experience with version control, open source practices, and code review
- Experience with applications designed to display archived web content
- Knowledge of entity resolution best practices and ontology creation
- Strong database creation and administration knowledge: MySQL, PostgreSQL, and NoSQL stores (Elasticsearch, graph databases)
- Experience with streaming data sources and RESTful interfaces, including familiarity with extracting data from publicly available API endpoints
- Experience with automotive industry web scraping is a plus
- Experience extracting data from PDFs
- Understanding of web page architecture
- Experience in digital image manipulation (converting images)
- Experience with AWS, RDS, Python (especially Beautiful Soup), and Bright Data
Pluses:
- Experience with Apache Airflow.
- Experience working in an Agile development environment.
- Experience programming in Python and/or Golang.
- Experience with columnar storage (AWS Redshift, Google BigQuery).
- Experience with ETL tools.
Super Pluses:
- Experience with public record data.
Languages:
- English - Proficient level
Why LTV Co.?
If you have ambitions to be a part of a high-growth, results-driven, industry-leading organization, LTV is the place to be. LTV builds exciting data products and then markets them with passion. We’re a fast-growing company in New York City that balances the culture of a startup with the stability of an established, profitable company. We want to work with people who strive to be in the top 0.01% of their field. We understand that getting to the top takes hard work, constant improvement, and data-driven decisions. It’s a thrilling time to join the team, as we’re expanding our product offerings in exciting new ways, driving innovation through data, marketing, and web & app development.
We believe in diversity and in hiring people from all backgrounds and walks of life. You must be energetic, inventive, a team player, and looking to help build and grow the company each and every day. You must have an inner desire to win; the idea of losing is a non-starter. If you are looking for a position that allows you to work with a group of smart and dedicated people who will support you but still provide the autonomy you need to execute your strategy, then you should probably apply as soon as you’re done reading this!
About Us
LTV was founded in New York by Josh Levy and Ross Cohen in 2007. At the time, their mission was to provide easy and affordable access to public records, something that in 2007 was only really accessible to corporations. Since then, their mission has expanded to developing products and services that grant access to information and data across a number of verticals. In service of this mission, LTV has seven consumer brands: BeenVerified, NumberGuru, PeopleLooker, NeighborWho, Ownerly, PeopleSmart, and Bumper.
Our mission is to develop a diverse portfolio of technologies, products, and services that gives all people equal access to unbiased data and information. We believe that through this access people can empower and protect themselves in today’s ever-changing world, filled with fake news, deception, and a lack of transparency.