mirror of https://gitlab.com/SIGBUS/nyaa.git synced 2025-01-10 15:44:09 +00:00
Commit graph

11 commits

Author SHA1 Message Date
queue eceb8824dc sync_es: fix flush_interval behavior during slow times
instead of flushing every N seconds, it flushed N seconds after
the last change, which could drag out to N seconds * M batch size
if there are few updates. Practically this doesn't change anything
since stuff is always happening.

Also fix not writing a save point if nothing is happening. Also
practically does nothing, but for correctness.
2017-05-28 20:14:14 -06:00
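
A rough sketch of the flush timing described in eceb8824dc, not the actual sync_es code. The `Batcher` class, its parameters, and the injected `clock` are all hypothetical names. The point it illustrates is that the deadline is fixed when the previous flush happens and is not pushed back by new items, and that an empty flush still runs so a save point can be written.

```python
class Batcher:
    """Collects items and flushes on a fixed schedule or when the batch is full."""

    def __init__(self, flush_interval, max_batch, flush, clock):
        self.flush_interval = flush_interval
        self.max_batch = max_batch
        self.flush = flush    # hypothetical callback taking a list of items
        self.clock = clock    # returns seconds, e.g. time.monotonic
        self.items = []
        self.deadline = clock() + flush_interval

    def add(self, item):
        self.items.append(item)
        # The deadline is deliberately NOT reset here. Resetting it on every
        # change is what lets a flush drift out to interval * batch size.
        if len(self.items) >= self.max_batch:
            self._flush()

    def tick(self):
        """Call periodically, including while no items are arriving."""
        if self.clock() >= self.deadline:
            self._flush()

    def _flush(self):
        # Runs even with an empty batch, so the callback can still
        # record a save point when nothing is happening.
        self.flush(self.items)
        self.items = []
        self.deadline = self.clock() + self.flush_interval
```

Passing the clock in rather than calling `time.monotonic()` directly is only a convenience for testing the timing without sleeping.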
queue 33852a55bf sync_es: die when killed 2017-05-28 20:02:20 -06:00
TheAMM 9cd6c506ae Update Elasticsearch index and scripts for comment_count 2017-05-26 16:12:47 +03:00
nyaadev 152e547ac5 Add Flask-Migrate + alembic for automated database migrations.
Update some dependencies to their latest version.
Make executable scripts executable (chmod +x).
2017-05-21 17:47:16 +02:00
queue ea2160a49d sync_es: move io to separate threads, config json
throughput is definitely massively improved, testing locally.
hopefully it'll be enough.

config moved to a separate file by ops request. lazy lazy
2017-05-21 00:55:19 -06:00
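
One way the split into separate I/O threads from ea2160a49d could look, as a minimal sketch under assumptions. The function names, the sentinel, and the `post` callback are hypothetical. A reader thread feeds a bounded queue and a second thread drains it in batches, so a slow POST does not stall reading (and the bounded queue keeps the reader from running arbitrarily far ahead).

```python
import queue
import threading

SENTINEL = object()  # marks the end of the event stream


def reader(events, out_q):
    """Stand-in for the binlog reader: push every event onto the queue."""
    for event in events:
        out_q.put(event)   # blocks when the queue is full
    out_q.put(SENTINEL)


def poster(in_q, post, batch_size):
    """Stand-in for the ES writer: drain the queue and post in batches."""
    batch = []
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            post(batch)
            batch = []
    if batch:
        post(batch)        # flush whatever is left at shutdown


def run_pipeline(events, post, batch_size=3, max_queue=100):
    q = queue.Queue(maxsize=max_queue)
    threads = [
        threading.Thread(target=reader, args=(events, q)),
        threading.Thread(target=poster, args=(q, post, batch_size)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

For example, `run_pipeline(range(7), print)` would hand the poster three batches in order.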
queue 6a4ad827c1 sync_es: instrument with statsd, improve logging
also fixed the save time loop and spaced it out
to 10k events instead of 100.

Notably, the number of rows per event caps out at around 5 by default
because of the default --binlog-row-event-max-size=8192 in mysql; that's
how many (torrent) rows fit into a single event.

We could increase that, but instead I think it's finally time to
multithread this thing; both the binlog read and the ES POST shouldn't
use the GIL so it'll actually work.
2017-05-20 23:19:35 -06:00
aldacron f27cf17478 added timeout to import and sync es 2017-05-16 23:15:48 -07:00
queue e38fe2575a sync_es.py: bulk actions per binlog event
mainly helps with the stat updates, which come in as
a single INSERT ... VALUES (...) ON DUPLICATE KEY UPDATE event,
which helpfully translates to a bulk index event.

It seems like elasticsearch should still be buffering that up
internally, so maybe the refresh_interval: 30s change will help
more than this.
2017-05-16 22:47:34 -06:00
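
To illustrate the idea in e38fe2575a, here is a minimal sketch of turning the rows of one binlog event into a single Elasticsearch bulk request body. `bulk_body`, the index name, and the `id` field are assumptions, not taken from sync_es.py. The body is newline-delimited JSON, one action line followed by one document line per row. Older Elasticsearch versions also expect a `_type` in the action line, which is omitted here.

```python
import json


def bulk_body(index, rows, id_field="id"):
    """Build an NDJSON bulk body that indexes every row of one event."""
    lines = []
    for row in rows:
        # Action line: which index and document id the next line goes to.
        lines.append(json.dumps({"index": {"_index": index, "_id": row[id_field]}}))
        # Source line: the document itself.
        lines.append(json.dumps(row))
    # The bulk API requires every line, including the last, to end in a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent as the body of one `POST /_bulk` request, so an event carrying several rows costs one round trip instead of several.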
aldacron 40c34e7df0 add stats as upsert in case of binlog not being sequential 2017-05-16 03:20:38 -07:00
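
A sketch of what an upsert for stats might look like in bulk form, assuming the standard `update` action with `doc_as_upsert`. The function name and field names are hypothetical. With `doc_as_upsert`, a stats update that arrives before its document exists creates the document instead of failing, which is the out-of-order case 40c34e7df0 mentions.

```python
import json


def stats_upsert_action(index, torrent_id, stats):
    """Return the two NDJSON lines for a partial update that also inserts."""
    action = {"update": {"_index": index, "_id": torrent_id}}
    payload = {"doc": stats, "doc_as_upsert": True}
    return json.dumps(action) + "\n" + json.dumps(payload) + "\n"
```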
aldacron 899aa01473 hooked up ES... 90% done, need to figure out how to generate magnet URIs 2017-05-15 23:51:58 -07:00
queue 32b9170a81 es: add sync_es script for binlog maintenance
lightly documented.
2017-05-15 01:32:56 -06:00