About the Lead Data Scientist (NLP & GenAI) Job Role
You will help our clients solve real-world problems by tracing the data-to-insights lifecycle:
- Understand business problems, make sense of the data landscape and footprint, and perform a combination of GenAI, advanced NLP, and exploratory analysis
- Create, experiment with, and deliver innovative solutions to client stakeholders using textual data, with a consultative mindset
- Guide a team of data scientists to deliver exceptional solutions to clients across domains
Work Location: Hyderabad/Bangalore (Hybrid mode with 3 days in office)
Qualification and experience for the Lead Data Scientist (NLP) Role:
- Background in Computer Science/Computer Applications or any quantitative discipline (Statistics, Mathematics, Economics/Operations Research etc.) from a reputed institute.
- 5-8 years of experience using analytical tools/languages like Python on large-scale data
- Must have experience with semantic models and named entity recognition (NER)
- Experience working with pre-trained models, and awareness of the state of the art in embeddings and their applicability to use cases
- Must have strong experience in NLP/NLG/NLU applications using popular deep learning frameworks such as PyTorch or TensorFlow and models such as BERT or GPT (or similar)
- Demonstrated ability to engage with client stakeholders at multiple levels and provide consultative solutions across different domains
- Deep knowledge of techniques such as linear regression, gradient descent, logistic regression, forecasting, cluster analysis, decision trees, linear optimization, and text mining
- Strong understanding of integrating NLP models into business workflows. Candidates should have taken at least one project from initiation through to business impact.
Experience in productionizing & retraining models:
- Ability to guide and mentor teams of associates on solution development and approaches
- Broad knowledge of fundamentals and state-of-the-art in NLP and machine learning
- Coding skills in one or more programming languages such as Python or SQL
- Expert-level understanding of language semantics concepts and data standardization
- Proven track record of successful models and practical implementation
- Experience in training transformer-based language models and their variants (T5, BART, BERT, etc.)
- Knowledge of the transformer architecture and the impact of modifying it
- Familiarity with multiple evaluation metrics for LLMs
- Experience building pipelines with Hugging Face, LangChain, etc.
- Experience with vector databases and text embedding models
- Knowledge of different prompting templates: zero-shot, few-shot, composition, etc.
- LLM in-context learning, fine-tuning, model evaluation metrics, etc.
- Text pre- and post-processing techniques
- Experience in using GPUs to train deep learning models
- Good knowledge of solving industrial problems using deep learning models for NLP-related use cases
- Familiar with all prompting techniques
- Hands-on experience with popular ML frameworks such as PyTorch (must) and TensorFlow
- Experience with production deployment of LLM solutions
- Building scalable LLM solutions
- Familiarity with cloud services such as Azure ML Studio or AWS SageMaker is considered a plus
- Knowledge of machine learning techniques for entity resolution, common speech products, or text search