Optimising full-text queries in the amaGama translation memory server

PostgresConf South Africa

Friedel Wolff
PostgresConf South Africa 2019
postgresconf.org/conferences/...
The amaGama project implements a FOSS translation memory web service built with Python on top of PostgreSQL. I recently worked on improving its performance, and would like to report on what I did and how I did it. The presentation will cover how an understanding of the problem domain, usage patterns and algorithms involved allowed for a big performance improvement despite some (arguable) shortcomings of PostgreSQL.
A translation memory contains texts and their translations. The amaGama service hosts such a database, holding translations of many FOSS packages (GNOME, KDE, PostgreSQL, Mozilla, LibreOffice, etc.) in many languages. The web service is typically queried by a computer-assisted translation tool, and responds with similar translations done before in the language pair of interest. The suggestions are meant to help translators work faster and with higher quality.
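To make "similar translations done before" concrete, here is a minimal sketch of a fuzzy lookup against a tiny in-memory translation memory. It is only an illustration: the `MEMORY` entries, the `suggest` function and the 0.7 threshold are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure amaGama actually uses, which the abstract does not specify.

```python
from difflib import SequenceMatcher

# Hypothetical English -> Afrikaans memory: (source text, translation).
MEMORY = [
    ("Open the file", "Maak die lêer oop"),
    ("Close window", "Sluit venster"),
    ("Save file as", "Stoor lêer as"),
]


def suggest(query, memory=MEMORY, min_similarity=0.7):
    """Return (source, target, score) tuples at or above the threshold, best first."""
    results = []
    for source, target in memory:
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, query, source).ratio()
        if score >= min_similarity:
            results.append((source, target, score))
    return sorted(results, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    for source, target, score in suggest("Open file"):
        print(f"{score:.2f}  {source!r} -> {target!r}")
```

Comparing the query against every stored text like this would not scale to a large database, which is why a service of this kind needs an index to narrow down the candidates first.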
Suggestions only help users if they arrive quickly, and amaGama would previously sometimes take multiple seconds to respond to certain queries, which is far too long to be helpful. The database schema features two simple tables with a full-text index (GIN) used to perform fuzzy matching on the previous texts. An analysis of query plans indicated bad row estimates, and arguably bad query plans. Running VACUUM and collecting more statistics did not improve things.
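As a rough sketch of what such a setup can look like, the snippet below builds the SQL for a GIN full-text index, a candidate query against it, and an `EXPLAIN` wrapper for inspecting the plan and row estimates. Everything named here is an assumption for illustration: the `sources` table, its `source` column, the `'simple'` text search configuration, and the choice to OR the query words together are not taken from the abstract. The `%(query)s` placeholder follows the psycopg2 parameter style.

```python
import re


def build_tsquery(text):
    """Turn a source text into an OR-ed tsquery string, e.g. 'open | file'."""
    words = []
    for word in re.findall(r"\w+", text.lower()):
        if word not in words:  # keep the first occurrence only
            words.append(word)
    return " | ".join(words)


# Hypothetical schema: a table `sources` with a text column `source`.
CREATE_INDEX_SQL = (
    "CREATE INDEX sources_fts_idx ON sources "
    "USING gin (to_tsvector('simple', source));"
)

# Candidate rows share at least one word with the query text; a finer
# similarity measure would then rank them.
SEARCH_SQL = (
    "SELECT source FROM sources "
    "WHERE to_tsvector('simple', source) @@ to_tsquery('simple', %(query)s)"
)


def explain(sql):
    """Wrap a statement so PostgreSQL reports the actual plan and buffer usage."""
    return "EXPLAIN (ANALYZE, BUFFERS) " + sql


if __name__ == "__main__":
    print(CREATE_INDEX_SQL)
    print(explain(SEARCH_SQL))
    print(build_tsquery("Open the file, then open it"))
```

Comparing the estimated and actual row counts in the `EXPLAIN (ANALYZE)` output is one way to spot the kind of bad row estimates mentioned above.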
Part of my solution involved a combination of CLUSTER with partially overlapping partial indexes. The partial indexes helped to address some shortcomings relating to full-text indexing with GIN, and combining them with CLUSTER ensured that disk I/O could be reduced for many queries. The median, average and worst-case times for queries were reduced to as little as 40% of their previous times.
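The following sketch shows one way overlapping partial indexes and CLUSTER could fit together. It assumes, purely for illustration, that the partial indexes are split by the length of the source text; the abstract does not say which criterion was used, and the `sources` table, the `source` column, the band boundaries in `BANDS` and all index names are hypothetical. Because PostgreSQL cannot CLUSTER on a GIN index or on a partial index, the sketch also assumes a separate B-tree index on `length(source)` for the physical ordering.

```python
# Hypothetical length bands (min_len, max_len); None means no upper bound.
# Neighbouring bands overlap so that the length range of interest for a
# query can fall entirely inside a single band.
BANDS = [(0, 40), (25, 120), (80, None)]


def partial_index_ddl(table="sources", column="source"):
    """Generate one partial GIN full-text index per length band."""
    statements = []
    for lo, hi in BANDS:
        suffix = f"{lo}_{hi if hi is not None else 'up'}"
        predicate = f"length({column}) >= {lo}"
        if hi is not None:
            predicate += f" AND length({column}) <= {hi}"
        statements.append(
            f"CREATE INDEX {table}_fts_{suffix}_idx ON {table} "
            f"USING gin (to_tsvector('simple', {column})) WHERE {predicate};"
        )
    return statements


def cluster_ddl(table="sources", column="source"):
    """Order the table on disk by text length using a plain B-tree index."""
    return [
        f"CREATE INDEX {table}_length_idx ON {table} (length({column}));",
        f"CLUSTER {table} USING {table}_length_idx;",
    ]


def covering_band(min_len, max_len):
    """Return the first band that contains [min_len, max_len], or None."""
    for lo, hi in BANDS:
        if min_len >= lo and (hi is None or max_len <= hi):
            return (lo, hi)
    return None


if __name__ == "__main__":
    for statement in partial_index_ddl() + cluster_ddl():
        print(statement)
    print(covering_band(30, 50))
```

Under these assumptions, a query that restricts `length(source)` to a range inside one band lets the planner use that band's smaller index, and clustering by length keeps rows of similar length physically close together, so fewer pages need to be read.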
