Factmata's mission is to allow anyone to discern and verify the credibility, quality, safety and reliability of online content. We are building state-of-the-art technology to semi-automatically score and verify news and social media, combining an expert network platform for machine learning with advanced natural language understanding techniques. Our team consists of top researchers in NLP and machine learning, and we are backed by the founders of Twitter, Craigslist, Zynga and Broadcast.com.
Factmata is looking for an experienced and creative data engineer to help us solve the problem of misinformation online. At Factmata, we are building systems to automatically classify the credibility, quality, safety and reliability of online information, using state-of-the-art machine learning algorithms paired with human expertise.
We’re looking for someone who will play a key role in building our core product as we bring it to market. You'll be responsible for adding new features to our data pipeline and connecting it to our machine learning systems, databases and external services. You will also be responsible for creating and maintaining our B2B APIs, which enable our clients to push data to and pull data from our pipeline.
Current stack: Python 3, PostgreSQL, RabbitMQ, CircleCI, Docker, Kubernetes, AWS
What you'll be doing
- Develop data and model storage systems
- Coordinate with both the business product and user product teams to ensure high quality engineering in an agile environment
- Scale and maintain our message-based data processing pipeline
- Extend and build upon our existing B2B APIs
- Develop web scraping architecture
- Manage integrations with other platforms and products
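To give a flavour of the pipeline work above, here is a minimal sketch of one message-based processing stage. The message shape, the `score_url` stub, and the in-memory queues are hypothetical illustrations only; in the real stack a RabbitMQ consumer (e.g. via pika) would play the role of the inbox.

```python
import json
import queue

def score_url(url: str) -> float:
    """Placeholder for a call into a credibility-scoring model.

    Hypothetical stub: the real system would invoke an ML service here.
    """
    return 0.5

def process_message(body: bytes) -> dict:
    """Parse one JSON pipeline message and attach a score.

    In production this would be a message-broker consumer callback;
    only the handling logic is shown so it runs without a broker.
    """
    msg = json.loads(body)
    msg["score"] = score_url(msg["url"])
    return msg

def run_stage(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Drain the inbox, scoring each message and forwarding the result."""
    while True:
        try:
            body = inbox.get_nowait()
        except queue.Empty:
            break
        outbox.put(process_message(body))
```

In the actual stack, `process_message` would be registered as a consumer callback on a RabbitMQ queue rather than driven by an in-memory loop.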
What we're looking for
- A proactive, positive attitude and the ability to own projects end to end
- A passion for writing clean, well-tested, maintainable code
- PostgreSQL (or MySQL) experience
- 5+ years of professional engineering experience
- Extensive Python experience
- Experience with distributed systems
- Experience with Docker
- Experience developing distributed microservices in Python
Bonus points for experience with:
- Building complex data processing pipelines
- Building scalable REST APIs
- CI tools such as CircleCI
- Real-time distributed systems processing
- Data pipelines that can handle >1m URLs per hour
Please note that this is an on-site role based in London. You must have existing eligibility to work in the United Kingdom.