July 22–23

PyCon Russia 2018

Dmitry Khodakov, Avito

Tornado vs Aiohttp

At Avito, the largest classifieds site in Russia, we run a lot of microservices in Kubernetes to apply machine learning in production.

When you have several dozen heavily loaded microservices on your hands, you will inevitably think about the performance of asynchronous applications.

In my talk I'll cover:

  • Typical problems and pitfalls in developing a Python microservice framework
  • Profiling of asynchronous applications
  • Fundamental differences between async Tornado and async aiohttp
  • Tornado vs aiohttp performance in the real world

Vadim Pushtayev, Mail.Ru

Autotests in Mail.Ru projects

This talk is about how we do unit testing in Python: from little things like naming test methods to more significant problems like mocking subsystems, how to (or not to) use TDD, what to do with fixtures, and more.

Serg Karpovich, Vadim Berezkin, mos.ru

How to make a user-friendly search engine

Yandex and Google demonstrate a high level of usability and search quality, and their answers match the user's intent. Free search engines cannot match Yandex and Google out of the box and do not meet all the needs of businesses and users: you have to configure the engine and refine the search infrastructure yourself. We will talk about the available tools and ways to customize the usability, quality, and relevance of internal search, using Elasticsearch and Python as an example.

The talk will be useful to developers of search engines for sites and portals.

Vitaly Davydov, POTEHA DEVELOPERS

Serverless + Python using AWS Lambda as an example

Nowadays the most technologically advanced way to deploy web apps is to use Docker containers. To manage a cluster of containers, a company usually needs dedicated people: DevOps engineers. A couple of years ago a new technology called Serverless was released. It almost completely eliminates the DevOps layer and allows developers to deploy their apps easily while focusing on business logic. Using the FaaS (function-as-a-service) concept, Serverless can run not only MVPs and prototypes but also high-load services at very low cost.

The talk will cover the core aspects of developing Serverless apps, costs, and the top solutions on the market, and will end with a short live demo on AWS Lambda. It will be useful for startups and small to medium businesses that would like to optimize development costs and time-to-production for new products.
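As a rough illustration of the function-as-a-service model described above, a Python function on AWS Lambda is essentially a handler that receives an event and a context object. The sketch below assumes an HTTP-style event delivered through an API Gateway proxy integration. The handler name `lambda_handler` and the `name` query parameter are illustrative choices, not taken from the talk.

```python
import json


def lambda_handler(event, context):
    """Entry point that AWS Lambda would call for each invocation.

    Assumption: the event has the shape used by API Gateway proxy
    integrations, where query parameters arrive under
    "queryStringParameters" (which may be missing or None).
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical parameter

    # A proxy integration expects a status code and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be exercised locally by passing a dictionary as the event and `None` as the context, with no cluster or server process involved.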

Evgeniy Slezko, Marilyn System

Switching an existing Python project to SOA. Hard or nothing?

I’d like to share our experience of implementing a service-oriented architecture in an existing application that has been developed in Python for over five years. Why would you need this? What problems does it solve and what problems does it create? What should you take into account before you start? What benefits does it have for a software engineer and for a manager?

The presentation is for experienced software developers, system architects, and managers at different levels who have either already faced or will face the problem of SOA implementation in their projects.

After the transition to SOA, the complexity doesn’t disappear but rather moves to another level. Fortunately, on that level most of the potential problems have already been solved. After eliminating complexity at the business-logic level, we can focus on the problem itself.

Christian Heimes, Red Hat

SSLError, now what?

TLS/SSL is the most important and widely-used protocol for secure and encrypted communication, e.g. HTTPS. It offers more than just encryption. TLS also ensures data integrity and strong authentication with X.509 certificates. But it provides merely a false sense of security if you use it wrong.

Have you ever encountered SSLError while connecting to a server without understanding what was going on? Are you running production code without TLS/SSL protection or with certificate validation disabled, because you couldn’t figure out how to make it work correctly?

I’ll give you a rundown of the basic cryptographic building blocks, the protocol handshake, the inner structure of certificates, and PKI. You’ll learn about best practices, debugging tools, and tips on how to diagnose TLS/SSL problems and how to deal with certificates.

This talk is suitable for both beginner and advanced Pythonistas.
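To make the point about disabled certificate validation concrete, here is a minimal sketch using only the standard `ssl` and `socket` modules. It keeps validation switched on and lets a failure surface as `ssl.SSLError`. The helper names `make_verifying_context` and `peer_subject` are invented for this example and are not from the talk.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Return a client context with certificate and hostname checks on."""
    # create_default_context() loads the system CA store and, for client
    # use, enables CERT_REQUIRED together with hostname checking.
    return ssl.create_default_context()


def peer_subject(host: str, port: int = 443, timeout: float = 5.0):
    """Connect to host:port and return the certificate subject.

    Raises ssl.SSLError (including ssl.SSLCertVerificationError) if the
    handshake or certificate validation fails.
    """
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["subject"]
```

Catching `ssl.SSLCertVerificationError` around a call to `peer_subject` and inspecting its `verify_message` attribute is usually more informative than turning verification off.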

Alexander Menshikov, Laboratory of Intelligent Systems Engineering

Robotics with Python and ROS

Valentin Malykh, Alexey Lymar, MIPT

Workshop «DeepPavlov: open-source Python library for dialog systems»

At the iPavlov.ai workshop you'll learn how to make your own chatbot (e.g. for Telegram or any other IM service). We'll show how to handle data for dialog system training and how to use existing models from the DeepPavlov library.

Melanie Warrick, Google

Reinforcement Learning

Reinforcement learning is a popular subfield of machine learning because of its success in beating humans at complex games like Go and Atari. The field’s value lies in using a reward system to develop models and find better ways to solve complex, real-world problems. This approach allows software to adapt to its environment without full knowledge of what the results should look like. This talk will cover reinforcement learning fundamentals and examples to help you understand how it works.

Marina Kamalova, Yandex

Chat bot architecture

In this talk, I will explore creating chatbots in Python. Topics will include:

  • Python libraries and frameworks doing lots of "boring" stuff for you: from API adapters to natural language processing and generation.
  • Adapting the chatbot for other messengers (and non-messengers alike).
  • Scalability and reliability issues, with a special focus on Telegram API.

The talk is aimed at anyone interested in making chatbots for fun and profit. It may also serve as a showcase for doing natural language processing in Python.

Andrii Soldatenko, Toptal

Competitive programming using Python

Competitive programming is a fantastic discipline for improving your programming and math skills. The idea is very simple: given well-known computer science problems, solve them as quickly as possible. In addition, such challenges are frequently used as initial coding interview tasks. Most participants use C, C++, or Java, but over the last decade we have seen a growing number of contestants who use Python in programming challenges.

In this talk I’ll show you how to start competing with Python and share a couple of personal tips and tricks on how to prepare for and take part in programming contests. I’ll show you how to motivate yourself to practise, how to identify the class of a problem, and how to approach solving it. I’ll demonstrate some limits of Python and how to work around them to write correct and fast solutions. I’ll discuss how to master the art of testing and how to crack and generate hidden test cases and boundary conditions. I’ll also explain how to quickly estimate the complexity of your solution, without too many proofs and too much math, to get the desired AC (Accepted).

It takes a long time to become a good competitive programmer, but it is also an opportunity to learn a lot.

Kate Heddleston, Shift

Technical Debt in Python

Technical debt is something that every team has to deal with at some point, often sooner than you'd think. This talk is about what technical debt is, what it looks like specifically in Python, and how you can think about reducing it.

Yury Selivanov, Andrew Svetlov, Christian Heimes

Core Development Panel

Yury Selivanov, EdgeDB, asyncio

Asyncio: today and tomorrow

Alejandro Saucedo, Eigen Technologies

Industrial Data Pipelines with Python and Airflow

This talk is a practical deep dive into building industry-ready machine learning and data pipelines with Airflow. It builds up from the basics of Airflow and shows how to build scalable machine learning data pipelines using a distributed architecture with a Celery backend. I will share some of the key lessons I've learned throughout my career building and deploying machine learning systems in critical environments across several sectors.

Stephan Jaensch, Yelp

Type annotations with large(r) codebases

You've heard about type annotations, you know they help reduce bugs and improve documentation especially for large codebases, and you've attended an introductory talk or read a tutorial about using them. But how do you get started using them with your big, existing codebase? How do you make sure your colleagues will be annotating new code they write - or existing code they're changing? And how do you get around some of the issues you might run into when using the still-beta type checker mypy on your codebase?

This talk will start where the typical introductory Python type annotation talks end and discuss the real-world challenges when starting to annotate types with an existing codebase of tens or hundreds of thousands of lines of code. I'll walk you through best practices learned from doing just that at Yelp, telling you about some of the roadblocks we hit (and how we got past them). We'll also take a look at:

— how you can get the most out of type annotations even with non-annotated third-party libraries

— how to deal with decorators and other things that currently don't always work well with annotations

— when the only way to get proper type checking is through refactoring your code.
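As an illustration of the decorator problem mentioned above, a naively annotated decorator tends to hide the wrapped function's signature from the type checker. One common workaround, shown here as a sketch and not necessarily the approach used at Yelp, is to bind a `TypeVar` to the callable type and `cast` the wrapper back to it. The names `logged`, `call_log`, and `add` are made up for the example.

```python
import functools
from typing import Any, Callable, List, TypeVar, cast

# F stands for "whatever callable type was passed in".
F = TypeVar("F", bound=Callable[..., Any])

call_log: List[str] = []  # hypothetical log used only for this example


def logged(func: F) -> F:
    """Record each call while keeping the decorated function's
    signature visible to the type checker."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        call_log.append(func.__name__)
        return func(*args, **kwargs)

    # The cast tells the checker that wrapper has the same type as func.
    return cast(F, wrapper)


@logged
def add(x: int, y: int) -> int:
    return x + y
```

With this pattern a checker such as mypy should still see `add` as taking two `int` arguments and returning an `int`. The trade-off is that the body of `wrapper` itself is only loosely checked.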

Denis Kataev, Tinkoff

SQLAlchemy: Python vs Raw SQL

Elena Nikitina, System

Sergei Borisov, Domclick.ru

Workshop «Cython - C for humans»

Python has excellent facilities for integration with C code. This allows you to optimize performance-critical functions at a low cost while maintaining flexibility. Sometimes this is easier than rewriting the entire service in Go / Rust / . I will show you what tools can be used to solve this kind of problem, and together we will create an asynchronous client with a simple protocol.

Maxim Mazaev, CIAN

How to maintain the consistency of the API in a micro-service architecture

I'll talk about how we manage a large number of microservices at CIAN and how we deal with the typical problems of maintaining them: versioning and API consistency. I'll cover how to control consistency with a CI system and how we use code generation and Swagger schemas. This talk will be useful for experienced developers who are interested in microservice architecture.

Donald Whyte, Engineers Gate

High Performance Data Processing in Python

The Internet age generates vast amounts of data. Most of this data is unstructured and needs to be cleaned. Python has become the standard tool for transforming this data into more usable forms.

numpy and pandas are the most popular Python libraries for processing large quantities of data. For small datasets, these libraries do the job without much effort. However, when running complex transformations on larger datasets, many developers fall into common pitfalls that kill the performance of these libraries.

This talk explains how numpy and pandas work under the hood and how they use vectorisation to process large amounts of data extremely quickly. We show an example dataset being processed using numpy/pandas. We demonstrate how to use these libraries effectively, reducing the processing time of this large dataset from several hours to seconds.
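The kind of vectorisation the abstract refers to can be sketched as follows. The same weighted sum is written once as a Python-level loop and once as a single array expression that NumPy evaluates in compiled code. The data and the function names `total_loop` and `total_vectorised` are invented for illustration and are not the dataset from the talk.

```python
import numpy as np


def total_loop(prices, quantities):
    """Sum price * quantity with an explicit Python-level loop."""
    total = 0.0
    for price, quantity in zip(prices, quantities):
        total += price * quantity
    return total


def total_vectorised(prices, quantities):
    """Same computation as one array expression evaluated inside NumPy."""
    return float(np.dot(prices, quantities))


# Hypothetical data, made up for the illustration.
prices = np.array([1.5, 2.0, 0.25, 4.0])
quantities = np.array([2, 3, 8, 1])
```

Both functions return the same total for these arrays. On large inputs the vectorised form avoids per-element interpreter overhead, which is typically where the speed-up comes from.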

Mikhail Korobov, ScrapingHub

Machine Learning in Web Scraping and Web Crawling

Everyone knows how to do web scraping and web crawling with Python: download and automate web pages using Scrapy / Selenium / requests; extract structured information with XPath, CSS, BeautifulSoup selectors or regular expressions. But this approach falls short when you want to extract information from millions of websites; just rules and heuristics can't get you there.

So, the talk is about applications of Machine Learning in web scraping and web crawling tasks:

  • how to classify web pages;
  • how to make computers "understand" page elements: html forms, pagination, etc.;
  • how to extract structured information from a web page;
  • how to create smart crawlers that don't download unnecessary content: duplicates, off-topic pages.

I'll use examples from my practice to illustrate the solutions, describe applications of Deep Learning and Reinforcement Learning in web crawling and web scraping, and point to various Open Source components for creating "smart" web spiders.

Andrey Vlasovskikh, JetBrains

7 tips for editing code in PyCharm

Andrew Svetlov, Python Core Developer

aiohttp from its author

Alexey Kuzmin, Domclick.ru

Ling Zhang, Aiden.ai

NLP to Discover Rich Insights from Massive Noisy Text

In this talk, I present a case study of how we extracted rich, actionable insights from a large, noisy corpus of unstructured survey responses for a government entity. We reduced the time to analysis from months to minutes. We use scikit-learn and NLTK to explore techniques such as clustering, natural language understanding, and summarization, and go over both practical methods and the underlying theory.