Stefan Mićić, Developer in Novi Sad, Vojvodina, Serbia
Stefan is available for hire

Stefan Mićić

Verified Expert in Engineering

Python Data Engineer and Developer

Location
Novi Sad, Vojvodina, Serbia
Toptal Member Since
July 20, 2022

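Stefan is an experienced machine learning and machine learning operations (MLOps) engineer with hands-on experience in big data systems. He has a decade of expertise and a master's degree in artificial intelligence. Stefan has worked on problems such as object detection, classification, sentiment analysis, named-entity recognition (NER), and recommendation systems. He always looks forward to taking part in end-to-end machine learning projects.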

Portfolio

RhythmScience Inc.
Machine Learning, Python, Keras, PyTorch, Deep Learning, Scikit-learn...
PlusPower
Python 3, Amazon SageMaker, Amazon Web Services (AWS), Docker, Bitbucket...
Cumulus Technologies LLC
Artificial Intelligence, Machine Learning, Python...

Experience

Availability

Part-time

Preferred Environment

PyCharm, Python 3, Python, GitHub, Amazon S3 (AWS S3), JSON, Distributed Systems

The most amazing...

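...end-to-end machine learning solutions I've created have optimized the cost of machine learning pipelines multiple times and achieved state-of-the-art results.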

Work Experience

Machine Learning Engineer

2023 - PRESENT
RhythmScience Inc.
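  • Determined HIPAA requirements for databases and various file types (HL7, XML, and PDF), then Dockerized and automated the entire pipeline.
  • Developed ML algorithms to generate text and classify PDF reports.
  • Designed, implemented, and deployed solutions.
Technologies: Machine Learning, Python, Keras, PyTorch, Deep Learning, Scikit-learn, Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Data Integration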

Senior MLOps Engineer

2023 - 2024
PlusPower
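  • Developed large ML pipelines using SageMaker, including preprocessing, training, evaluation, and deployment.
  • Developed a pipeline that generates Airflow pipelines from DAG configurations and deploys them automatically.
  • Increased test coverage from 15% to 80% and added integration tests so that we could test SageMaker pipelines locally.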
Technologies: Python 3, Amazon SageMaker, Amazon Web Services (AWS), Docker, Bitbucket, DocumentDB, Grafana, Datadog, Terragrunt, Apache Airflow, Pytest, Terraform, Infrastructure
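
To illustrate what generating a pipeline from a DAG configuration can look like, here is a minimal, framework-free sketch that turns a dictionary of task dependencies into a valid execution order. It uses only the Python standard library (3.9+). The config layout, the task names, and the `build_execution_order` helper are hypothetical and are not taken from the actual project, which used Apache Airflow.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical config: each task maps to the tasks it depends on.
PIPELINE_CONFIG = {
    "preprocess": [],
    "train": ["preprocess"],
    "evaluate": ["train"],
    "deploy": ["evaluate"],
}


def build_execution_order(config):
    """Return task names in an order that respects every dependency."""
    known = set(config)
    for task, deps in config.items():
        missing = set(deps) - known
        if missing:
            raise ValueError(f"{task} depends on unknown tasks: {sorted(missing)}")
    try:
        # TopologicalSorter expects a mapping of node -> predecessors.
        return list(TopologicalSorter(config).static_order())
    except CycleError as exc:
        # The detected cycle is the second element of the exception args.
        raise ValueError(f"config contains a cycle: {exc.args[1]}") from exc


if __name__ == "__main__":
    print(build_execution_order(PIPELINE_CONFIG))
```

Validating the config before building anything keeps a bad dependency from surfacing only at deployment time. In an orchestrator, the same ordering step would be replaced by wiring task objects together.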

AI Lead via Toptal

2023 - 2024
Cumulus Technologies LLC
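  • Created the entire CI/CD pipeline on AWS. Everything from data ingestion and processing through model training to model deployment was automated.
  • Designed and led the implementation of the entire ML pipeline using various AWS services such as Lambda, Polly, and SageMaker.
  • Developed on AWS to meet high security requirements (AWS Cloud9, AWS CodeCommit, and AWS CodePipeline).
Technologies: Artificial Intelligence, Machine Learning, Python, Amazon Web Services (AWS), Amazon SageMaker, Machine Learning Operations (MLOps), Hyperledger Fabric, Google Cloud Platform (GCP), SQL, PostgreSQL, Database Migration, Large Language Models (LLMs), Models, Unit Testing, English, Generative Artificial Intelligence (GenAI), Language Models, Stock Trading, Algorithmic Trading, Finance, Financial Software, Trading Systems, OpenAI, Prompt Engineering, Retrieval-augmented Generation (RAG), OpenAI GPT-3 API, OpenAI GPT-4 API, APIs, Speech Recognition, System Architecture, Infrastructure, Google Cloud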

MLOps Engineer

2023 - 2023
NewsCorp
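  • Deployed various LLMs and Stable Diffusion models.
  • Worked on latency and cost optimization for LLMs. Reduced latency fivefold using different deployment techniques.
  • Took responsibility for the complete deployment process and documentation maintenance for the entire ML component.
Technologies: Amazon EC2, GitHub, Docker, Deep Learning, Models, Unit Testing, English, Query Optimization, Language Models, Retrieval-augmented Generation (RAG), APIs, Infrastructure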

MLOps Engineer

2022 - 2023
PepsiCo Global - DPS
  • Implemented end-to-end pipelines using PySpark machine learning pipelines.
  • Implemented CI/CD with unit and integration tests using GitHub Actions.
  • Implemented Spark and scikit-learn/Pandas ETL jobs that process large volumes of data (150 TB).
Technologies: Machine Learning Operations (MLOps), APIs, Machine Learning, Python, Databricks, Big Data, Spark, Scikit-learn, Pandas, CI/CD Pipelines, REST APIs, ETL, Models, Unit Testing, Data Processing, English, Query Optimization, MLflow, Data Analytics, Infrastructure

Tech Lead Data Engineer

2022 - 2023
Motius
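  • Led a small team implementing an ELT pipeline that fetched data from a GraphQL database and loaded it into Azure SQL. Everything was Dockerized and pushed to the Azure image registry.
  • Implemented KPI calculations using PySpark, which communicated with Snowflake. Defined table schemas for Snowflake and created migration scripts.
  • Followed the Scrum methodology, including daily Scrums, retrospectives, and planning, and used Jira.
  • Led a small team implementing ETL Spark jobs with Apache Airflow as the orchestrator, AWS as the infrastructure, and Snowflake as the data warehouse.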
Technologies: Spark, Apache Spark, PySpark, Snowflake, Python, Python 3, Amazon Web Services (AWS), Databases, Distributed Systems, Azure SQL, Azure, AWS Glue, Apache Airflow, Software Architecture, Data Pipelines, Data Analysis, CI/CD Pipelines, Database Migration, Data Engineering, ETL, Unit Testing, Data Processing, English, Query Optimization, Data Analytics, Data Integration, ELT, DataOps

MLOps Engineer

2021 - 2022
Lifebit
  • Optimized deep learning models using quantization, ONNX Runtime, and pruning, among others.
  • Monitored model performance, including memory, latency, and CPU usage.
  • Automated the CI/CD process with Valohai and parts of the MLOps lifecycle with GitHub Actions.
  • Created automated experiment tracking using Amazon CloudWatch, Valohai, Python, GitHub Actions, and Kubernetes.
Technologies: Amazon EC2, Valohai, Keras, TensorFlow, Python 3, Lens Studio, Kubernetes, Codeship, GitHub, Open Neural Network Exchange (ONNX), Visual Studio Code (VS Code), Optimization, Neural Networks, NumPy, Monitoring, Amazon S3 (AWS S3), Cloud, Scikit-learn, Amazon Web Services (AWS), AI Design, Deep Neural Networks, Software Engineering, Pytest, JSON, Source Code Review, Code Review, Task Analysis, Databases, Data Science, CI/CD Pipelines, DevOps, REST APIs, Models, Unit Testing, English, Language Models, APIs, Amazon SageMaker, Terraform, Celery, Infrastructure, Ray.io
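
As a rough illustration of monitoring latency and memory for a model call, the sketch below times repeated invocations and records peak Python-heap usage with the standard library. The `profile_call` helper and the `fake_model` stand-in are hypothetical; the tooling actually used in this role is not described beyond the list above.

```python
import math
import statistics
import time
import tracemalloc


def profile_call(fn, *args, repeats=20):
    """Measure wall-clock latency and peak Python-heap memory of fn(*args)."""
    latencies_ms = []
    tracemalloc.start()
    try:
        for _ in range(repeats):
            start = time.perf_counter()
            fn(*args)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    ordered = sorted(latencies_ms)
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "mean_ms": statistics.mean(latencies_ms),
        "p95_ms": ordered[p95_index],
        "peak_kib": peak_bytes / 1024.0,
    }


def fake_model(values):
    # Stand-in for a real model's predict call (hypothetical).
    return [v * 0.5 for v in values]


if __name__ == "__main__":
    print(profile_call(fake_model, list(range(10_000))))
```

Note that `tracemalloc` adds overhead and only sees allocations made through Python's allocator, so a production setup would likely measure latency and memory in separate runs and rely on process-level metrics for native libraries.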

Machine Learning Engineer

2020 - 2021
HTEC Group
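  • Used Open Neural Network Exchange (ONNX) and a machine learning compiler to optimize already-trained networks without retraining, and implemented custom operators using PyTorch and C++.
  • Worked on Android machine learning solutions and mentored less experienced developers in training and preparing object detectors and classifiers to run smoothly on Android devices.
  • Enhanced a project aimed at upscaling images to 4K resolution as flawlessly as possible.
  • Participated in an SDP problem for ship routing. Implemented an algorithm from scratch to guide vessels. Fuel consumption and estimated time of arrival were used in the calculations.
  • Contributed to the open-source ONNX Runtime to add support for the MIGraphX library.
Technologies: Python 3, Python, Docker, Computer Vision, PyTorch, Artificial Intelligence (AI), Machine Learning, Team Leadership, Machine Learning Operations (MLOps), GitHub, Convolutional Neural Networks (CNNs), Open Neural Network Exchange (ONNX), Visual Studio Code (VS Code), Neural Networks, NumPy, Cloud, Pandas, Scikit-learn, Computer Vision Algorithms, AI Design, Deep Neural Networks, Software Engineering, Pytest, JSON, Technical Hiring, Source Code Review, Code Review, Task Analysis, Interviewing, Databases, Data Science, REST APIs, Models, Unit Testing, English, Language Models, Research, APIs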

Machine Learning Engineer

2019 - 2020
SmartCat
  • Contributed to completing the MLOps lifecycle using MLflow for model versioning, LakeFS for data versioning, AWS S3 for data storage, and TensorFlow Serving in Docker.
  • Worked as a data engineer completing ETL jobs with Apache Spark, scheduled using Prefect and Apache Airflow.
  • Trained several different object detection and classification architectures.
Technologies: Python 3, Scala, Python, Docker, SQL, Computer Vision, MongoDB, Artificial Intelligence (AI), Machine Learning, Data Engineering, Machine Learning Operations (MLOps), GitHub, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), ETL, Visual Studio Code (VS Code), Neural Networks, NumPy, Amazon S3 (AWS S3), Big Data, Image Processing, Cloud, Pandas, Scikit-learn, Object Detection, Computer Vision Algorithms, Object Tracking, Apache Spark, Amazon Web Services (AWS), AI Design, Deep Neural Networks, Software Engineering, Pytest, ETL Tools, JSON, Jupyter Notebook, Source Code Review, Code Review, Task Analysis, PySpark, Databases, Data Science, Distributed Systems, Data Pipelines, REST APIs, Models, Unit Testing, Data Processing, English, MLflow, APIs, Amazon SageMaker, Prefect

Machine Learning Engineer

2016 - 2019
Freelance
  • Scraped data from various websites, then analyzed and prepared the scraped data using natural language processing: long short-term memory (LSTM), Word2Vec, and transformers. Added NER because the data was in Serbian.
  • Used Amazon SageMaker to automate machine learning pipeline data preprocessing, model training, and deployment. Performed automatic retraining and deployment of models, completing the machine learning cycle as the client provided new data.
  • Worked on big data projects using Apache Spark, Kafka, Hadoop, and MongoDB.
  • Worked as a data engineer creating optimized ETL pipelines with Spark. Translated client requirements into SQL.
Technologies: Python 3, Spark, Amazon SageMaker, Python, Docker, Computer Vision, MongoDB, Artificial Intelligence (AI), Machine Learning, Data Engineering, Kubernetes, Machine Learning Operations (MLOps), GitHub, Amazon EC2, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), Open Neural Network Exchange (ONNX), Recommendation Systems, Natural Language Understanding (NLU), GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Visual Studio Code (VS Code), Time Series, Data Modeling, Data Mining, Neural Networks, NumPy, Amazon S3 (AWS S3), Big Data, Apache Kafka, Hugging Face, Transformers, Cloud, Pandas, Scikit-learn, Object Detection, Computer Vision Algorithms, Apache Spark, Amazon Web Services (AWS), AI Design, Web Development, Deep Neural Networks, Software Engineering, Pytest, JSON, Jupyter Notebook, Source Code Review, Code Review, Task Analysis, PySpark, Databases, Data Science, Distributed Systems, Project Management, CI/CD Pipelines, Google Cloud Platform (GCP), DevOps, REST APIs, Models, Unit Testing, English, MLflow, APIs

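Automated End-to-End (E2E) Computer Vision Solution

Created a system that performs several things in real time, including:
• Detecting objects in a room
• Classifying human poses
• Automatic retraining (active learning)
• Model and data versioning
• Dockerized pipeline
Using these models and predictions, we created a post-processing pipeline that generates reports or key performance indicators (KPIs) for clients.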
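
The active learning step listed above can be sketched with entropy-based uncertainty sampling, one common selection strategy. The project description does not say which strategy was used, so treat this as an assumption; the `select_for_labeling` helper, the frame ids, and the probabilities are all hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` most uncertain samples.

    `predictions` maps a sample id to its predicted class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical model outputs for three unlabeled frames.
    preds = {
        "frame_a": [0.98, 0.01, 0.01],
        "frame_b": [0.40, 0.35, 0.25],
        "frame_c": [0.70, 0.20, 0.10],
    }
    print(select_for_labeling(preds, budget=2))
```

The selected samples would be sent for labeling and added to the training set before the next retraining run, so labeling effort goes to the frames the model is least sure about.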

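Android COVID-19 Test Classification

The goal was to create a COVID-19 test classification model. Our dataset was small, and we had to build the best possible model in a very short time (two weeks).
I led a team of two on this project. We used MobileNet because of its size, and all business-related metrics were excellent. We used many optimization techniques to deploy the model to Android, such as quantization, pruning, and knowledge distillation.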

MLOps Engineer

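Worked on a project where my job was to optimize the entire machine learning system using quantization, pruning, ONNX, and more. I achieved the same accuracy with five times lower latency, half the model size, and four times lower cost. I also changed the underlying EC2 instance type to obtain more information about the system.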
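
For readers unfamiliar with quantization, the following is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It only illustrates the general idea of trading a small rounding error for a smaller representation; it is not the method or tooling used in the project, and the helper names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 when every weight is zero (or the list is empty).
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floats."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.003, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    worst_error = max(abs(a - b) for a, b in zip(weights, restored))
    print(q, scale, worst_error)
```

Each value is stored in one byte instead of four, and the reconstruction error per weight is bounded by half the scale, which is why accuracy can often be preserved.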

Image Super Resolution

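The goal was to improve upscaling and super-resolution models by researching and developing methods from SOTA research papers. The work involved many custom loss functions, layers, metrics, and even custom backpropagation.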

ETL Jobs

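• Created batch ETL jobs for calculating KPIs.
• Optimized solutions to reduce cost and computation time.
• Scheduled jobs with Airflow and Prefect.
The tech stack was Spark, Scala, AWS S3, Kafka, Apache Airflow, and Prefect.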

NLP Articles Processing

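The goal of this project was to develop two stages of article processing:
1. Find all relevant tags (events, locations, names, etc.) in the article.
2. Find pairs of tags that are related in some way.

Hugging Face Transformers (BERT-based models) were mainly used to solve this problem. Overall metrics were above 95%.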

Data Ingestion

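Led a team whose goal was to fetch data from a GraphQL database and insert it into Azure SQL. On every push to the main branch in GitLab, everything was Dockerized and pushed to EKS. Concurrent threads were used to optimize the solution.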
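
The use of concurrent threads for I/O-bound ingestion can be sketched with a thread pool from the Python standard library. This is a minimal illustration under assumptions: `fetch_page` is a hypothetical stand-in for a GraphQL request, and the paging scheme and worker count are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_page(page_number):
    # Stand-in for an I/O-bound GraphQL request (hypothetical).
    return [{"id": page_number * 10 + i} for i in range(3)]


def ingest(page_numbers, max_workers=4):
    """Fetch pages concurrently and return all rows in page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order even though requests overlap in time.
        pages = pool.map(fetch_page, page_numbers)
        return [row for page in pages for row in page]


if __name__ == "__main__":
    rows = ingest(range(3))
    print(len(rows), rows[0], rows[-1])
```

Threads help here because the work is dominated by waiting on the network, so overlapping requests shortens total ingestion time without needing multiple processes.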

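Tech Lead on a DE Project

I was responsible for all decisions, from architecture to implementation details. We used AWS infrastructure (CloudWatch, Glue, S3) and Airflow to orchestrate Spark jobs. Every Spark job result was saved to Snowflake.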
2020 - 2021

Master's Degree in Artificial Intelligence

University of Novi Sad - Novi Sad, Serbia

JULY 2022 - JULY 2025

AWS Certified Machine Learning - Specialty

Amazon Web Services

Libraries/APIs

PyTorch, Keras, NumPy, Scikit-learn, REST APIs, TensorFlow, Pandas, PySpark, Terragrunt

Tools

PyCharm, Amazon SageMaker, GitHub, Apache Airflow, Pytest, Codeship, AWS Glue, Bitbucket, Grafana, Terraform, Celery

Frameworks

Spark, Apache Spark, Streamlit

Languages

Python 3, Python, SQL, Scala, Java, Snowflake, GraphQL, C++

Paradigms

Data Science, ETL, Unit Testing, DevOps

Platforms

Amazon Web Services (AWS), Jupyter Notebook, Visual Studio Code (VS Code), Docker, Kubernetes, Amazon EC2, Apache Kafka, Azure, Databricks, Google Cloud Platform (GCP), Hyperledger Fabric, Kubeflow

Storage

Amazon S3 (AWS S3), JSON, Databases, PostgreSQL, NoSQL, MongoDB, Data Pipelines, Database Migration, Data Integration, Azure SQL, Datadog, Google Cloud

Industry Expertise

Trading Systems, Project Management

Other

Deep Learning, Machine Learning, Artificial Intelligence (AI), Data Engineering, Computer Vision, Natural Language Processing (NLP), Natural Language Understanding (NLU), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Machine Learning Operations (MLOps), Neural Networks, AI Design, Deep Neural Networks, Software Engineering, Technical Hiring, Source Code Review, Code Review, Task Analysis, Interviewing, APIs, GPT, Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), Models, Data Processing, English, Generative Artificial Intelligence (GenAI), Language Models, MLflow, OpenAI, Recommendation Systems, Open Neural Network Exchange (ONNX), Lens Studio, Optimization, Team Leadership, Valohai, Time Series, Data Modeling, Data Mining, Monitoring, Big Data, Image Processing, Transformers, Cloud, Object Detection, Computer Vision Algorithms, Object Tracking, Web Development, Speech Recognition, Voice Recognition, Cloud Services, ETL Tools, Distributed Systems, Data Analysis, CI/CD Pipelines, Query Optimization, Research, Stock Trading, Algorithmic Trading, Finance, Financial Software, Prompt Engineering, Retrieval-augmented Generation (RAG), OpenAI GPT-3 API, OpenAI GPT-4 API, Prefect, Data Analytics, ELT, System Architecture, Infrastructure, DataOps, Hugging Face, BERT, Back-end, Software Architecture, DocumentDB, Ray.io
