Remote jobs for honest people.

A peace-of-mind place to find good remote work.

Data Engineer

Doximity is the leading social network for healthcare professionals, with over 75% of U.S. doctors as members. We have strong revenues, profits, and real market traction, and we're putting a dent in the inefficiencies of our $2.5 trillion U.S. healthcare system. After the iPhone, Doximity is the fastest-adopted product by doctors of all time. Our founder, Jeff Tangney, is the founder and former President and COO of Epocrates (IPO in 2010), and Nate Gross is the founder of the digital health accelerator RockHealth. Our investors include top venture capital firms who've invested in Box, Salesforce, Skype, SpaceX, Tesla Motors, Twitter, Tumblr, Mulesoft, and Yammer. Our beautiful offices are located in SoMa, San Francisco.

You will join a small team of four data infrastructure engineers to build and maintain all aspects of our data pipelines: ETL processes, data warehousing, ingestion, and overall data infrastructure. We have one of the richest healthcare datasets in the world, and we're not afraid to invest in all things data to enhance our ability to extract insight.

## Job Summary

- Help establish robust solutions for consolidating data from a variety of data sources.
- Collaborate with product managers and data scientists to architect pipelines that support delivery of recommendations and insights from machine learning models.
- Build and maintain efficient data integration, matching, and ingestion pipelines.
- Establish data architecture processes and practices that can be scheduled, automated, and replicated, and that serve as standards for other teams to leverage.
- Build instrumentation, alerting, and error-recovery systems for the entire data infrastructure.
- Spearhead, plan, and carry out the implementation of solutions while self-managing.
- We expect you to be very comfortable around Unix, Git, and AWS.

## Experience & Skills

- At least three years of professional experience developing data infrastructure solutions.
- Fluency in Python and SQL.
- Experience building data pipelines with Spark and Kafka.
- Passion for clean code and testing with Pytest, FactoryBoy, or equivalent.
- Comprehensive experience with Unix, Git, and AWS tooling.
- Astute ability to self-manage, prioritize, and deliver functional solutions.

## Preferred Experience & Skills

- Experience with MySQL replication, binary logs, and log shipping.
- Experience with additional technologies such as Hive, EMR, Presto, or similar.
- Experience with MPP databases such as Redshift, and with both normalized and denormalized data models.
- Knowledge of data design principles and experience with ETL frameworks such as Sqoop or equivalent.
- Experience designing, implementing, and scheduling data pipelines on workflow tools like Airflow or equivalent.
- Experience working with Docker, PyCharm, Neo4j, Elasticsearch, or equivalent.

## Our Data Stack

- Python, Kafka, Spark, MySQL, Redshift, Presto, Airflow, Neo4j, Elasticsearch

## Fun Facts About the Team

- We have access to one of the richest healthcare datasets in the world, with deep information on hundreds of thousands of healthcare professionals and their connections.
- Business decisions at Doximity are driven by our data, analyses, and insights.
- Hundreds of thousands of healthcare professionals will use the products you build.
- Our R&D team makes up about half the company, and the product is led by the R&D team.
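To give a concrete flavor of the consolidation and ingestion work this role describes, here is a minimal, self-contained sketch in Python (the first tool in the stack above). Everything here is illustrative: the function names, the record shapes, and the last-write-wins merge policy are assumptions, not Doximity's actual pipeline. A production version would read from sources like Kafka or MySQL and load into a warehouse such as Redshift rather than merging in-memory lists.

```python
# Hypothetical sketch of a data-consolidation step: normalize records from
# multiple sources onto one schema, then merge them keyed on a shared "id".
from typing import Iterable


def normalize(record: dict) -> dict:
    """Lowercase keys and strip stray whitespace so records from
    different sources share one schema."""
    return {
        key.lower().strip(): (value.strip() if isinstance(value, str) else value)
        for key, value in record.items()
    }


def consolidate(*sources: Iterable[dict]) -> dict:
    """Merge records from several sources, de-duplicating on 'id'.

    Fields from later sources overwrite earlier ones (a simple
    last-write-wins policy, assumed here for illustration).
    """
    merged: dict = {}
    for source in sources:
        for record in source:
            clean = normalize(record)
            merged[clean["id"]] = {**merged.get(clean["id"], {}), **clean}
    return merged


# Two toy "sources" with inconsistent key casing and whitespace.
directory = [{"ID": "npi-1", "Name": " Dr. Lee "}]
claims = [
    {"id": "npi-1", "specialty": "cardiology"},
    {"id": "npi-2", "specialty": "oncology"},
]

records = consolidate(directory, claims)
# records["npi-1"] now carries both the name and the specialty.
```

In a real pipeline each stage would be a scheduled task (e.g. an Airflow operator) with instrumentation and error recovery around it, per the job summary above.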

via Remote OK

Apply for this job

Dec 6th, 2017
