mirror of https://gitlab.com/SIGBUS/nyaa.git synced 2024-11-01 04:25:54 +00:00
Commit graph

577 commits

Author SHA1 Message Date
TheAMM 9c3ac4dc67 ES: implement real substring matching
...by splitting input into characters, instead of whitespace delimited
words. This means you can now match partial words, real substrings from
anywhere: "foo ba" will match "Foo Bar Baz", while previously you had to
have full words ("foo bar") to match anything.

My dev setup incurred an 8% increase in storage usage, from ~13MB to
~14MB (for ~40k torrents).
Small change, big improvement. Wonder why I didn't do this at first.
2018-06-07 21:36:41 +03:00
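
The idea behind this commit can be sketched in plain Python. This is only an illustration, not the project's Elasticsearch analyzer configuration: the function names and the n-gram bounds (1 to 15) are assumptions made for the example.

```python
def char_ngrams(text, min_n=1, max_n=15):
    """Return every lowercase character n-gram of text, whitespace included.

    min_n and max_n are illustrative values, not the site's real settings.
    """
    text = text.lower()
    grams = set()
    for n in range(min_n, max_n + 1):
        for i in range(len(text) - n + 1):
            grams.add(text[i:i + n])
    return grams


def matches(query, title):
    # The query matches when it appears as one of the title's n-grams,
    # i.e. when it is a substring no longer than max_n characters.
    return query.lower() in char_ngrams(title)
```

Because the n-grams span whitespace, `matches("foo ba", "Foo Bar Baz")` holds, while a query such as "foo baz" that is not a contiguous substring does not match.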
Edward Betts d407f09cab Correct spelling mistakes. (#495) 2018-05-28 04:54:54 -07:00
Kcchouette b999f8d39f Fix availiable → available (#491) 2018-05-14 02:36:26 -07:00
Nicolas F bb9a62f71b user page: add manual activation button for mods (#472)
* user page: add manual activation button for mods

Moderators can press this button on inactive users to manually
activate their accounts.

Furthermore, the admin form code has been refactored a bit, reducing
some code duplication.
2018-05-10 18:57:59 -07:00
Anna-Maria Meriniemi 59db958977 ES: delimit words before ngram, optimize tokens (#487)
Before, long.tokens.with.dots.or.dashes would get edgengrammed up to the
ngram limit, so we'd get to long.tokens.wit which would then be split -
discarding "with.dots.or.dashes" completely. The fullword index would
keep the complete large token, but without any ngramming, so incomplete
searches (like "tokens") would not match it, only the full token.

Now, we split words before ngramming them, so the main index will
properly handle words up to the ngram limit. The fullword index will
still handle the longer words for non-ngram matching.

Also optimized away duplicate tokens from the indices (since we rely on
boolean matching, not scoring) to save a couple megabytes of space.
2018-04-28 18:09:40 -07:00
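
A rough Python sketch of the behaviour described in this commit, assuming a 15-character ngram limit and a hypothetical delimiter set (whitespace, dots, dashes, underscores). The real change lives in the Elasticsearch index settings; the names below are made up for illustration.

```python
import re

NGRAM_LIMIT = 15  # assumed limit for this sketch


def edge_ngrams(word, limit=NGRAM_LIMIT):
    """Return the prefixes of word, from 1 character up to the limit."""
    return [word[:n] for n in range(1, min(len(word), limit) + 1)]


def index_tokens(title):
    """Split on delimiters first, then edge-ngram each word.

    A set is used so duplicate tokens are stored only once.
    """
    words = [w for w in re.split(r"[\s.\-_]+", title.lower()) if w]
    tokens = set()
    for word in words:
        tokens.update(edge_ngrams(word))
    return tokens
```

Without the split, the longest prefix of "long.tokens.with.dots.or.dashes" is "long.tokens.wit", so "tokens" is never indexed; with the split, each dotted part gets its own prefixes.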
Alex Ingram 8f4202c098 Improve mobile user experience. 2018-04-24 23:19:14 -05:00
Alex Ingram 8498ff8e62 Fix torrent index on narrow viewports. (#484)
people should get bigger monitors
2018-04-23 15:47:18 -07:00
tohsa 7a494f26de Small changes to README.MD and .gitignore (#476)
* added es_sync_config.json and minified js to .gitignore

* reordered README.md to reflect that MySQL Binlogging must be enabled before running import_to_es.py
2018-04-23 00:27:09 -07:00
Anna-Maria Meriniemi 0cc25c3569 [ES] Improve search term preprocessing to include literal groups (#477)
* Extend ES term preprocessing for OR groups

Implements handling "foo"|"bar" literal OR groups in the Elasticsearch
term preprocessor. Groups can be negated with -, but don't mesh with
precedence (like plain literals).

This is a partial hack, the real solution would be to parse the entire
search terms ourselves, with AND and OR groups, negations etc. But
having that work neatly with the simple_query_string would be a bit of a
hassle.

* Update help.html search tips

since search (quoting strings) has changed a bit.
2018-04-15 09:53:36 +03:00
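
A minimal sketch of how "foo"|"bar" literal OR groups could be pulled out of a search string. The regex and the function name are assumptions for illustration and are not taken from the project's preprocessor.

```python
import re

# One or more quoted literals joined by |, optionally negated with a leading -
GROUP = re.compile(r'(-?)("[^"]+"(?:\|"[^"]+")*)')


def extract_groups(terms):
    """Return (remaining_terms, groups).

    Each group is a (negated, literals) pair; a plain quoted literal is
    treated as a group with a single member.
    """
    groups = []
    for neg, body in GROUP.findall(terms):
        literals = re.findall(r'"([^"]+)"', body)
        groups.append((bool(neg), literals))
    # Whatever is left would go to the regular query as-is.
    rest = " ".join(GROUP.sub(" ", terms).split())
    return rest, groups
```

For example, `extract_groups('-"foo"|"bar" baz "qux"')` yields the remaining term "baz", one negated group with the literals foo and bar, and one plain group with qux.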
Anna-Maria Meriniemi 0b78428abc [ES Change] Improve Elasticsearch term quoting (#473)
* Optimize Elasticsearch fullword field

Since the main display_name field ngrams words up to 15 characters,
anything of that length or shorter will already be indexed - the fullword
field (which we have for words longer than 15 characters) needs to index
only words longer than that.

* Preprocess ES terms for better literal matching

This commit adds a new .exact subfield to display_name, which holds a
barely-filtered version of the original title we can do "literal"
matching against. This is not real substring matching, but quoting
terms now actually does something!

Implements a simple preprocessor for the search terms to extract quoted
parts from the search terms, optionally prefixed with - to negate them.
The preprocessor will create a query that'll join all three query-types:
the simple_query_string, must-phrases and must-not-phrases.
2018-04-13 17:06:25 -07:00
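
The preprocessing step described above might look roughly like this. It is a sketch under assumptions: the regex and the names are hypothetical, and building the actual Elasticsearch query from the three parts is left out.

```python
import re

# A quoted phrase, optionally prefixed with - to negate it
QUOTED = re.compile(r'(-?)"([^"]+)"')


def preprocess(terms):
    """Split search terms into (plain_terms, must_phrases, must_not_phrases)."""
    must, must_not = [], []
    for neg, phrase in QUOTED.findall(terms):
        (must_not if neg else must).append(phrase)
    # The unquoted remainder would be handed to simple_query_string.
    plain = " ".join(QUOTED.sub(" ", terms).split())
    return plain, must, must_not
```

Given `foo "bar baz" -"qux" quux`, the plain part is "foo quux", the must-phrases are ["bar baz"] and the must-not-phrases are ["qux"].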
nyaadev 8f9400bb5f Revert "[Schema change] Torrents flags bitflag column to indexed columns (#471)"
This reverts commit 41a2a32f66.

Performs worse in some cases than what we had before.
2018-04-08 08:36:42 +02:00
A nyaa developer 41a2a32f66 [Schema change] Torrents flags bitflag column to indexed columns (#471)
* convert torrent table flags column from bitflag to independent indexed columns

* elasticsearch integration (untested)

* improve performance
2018-04-07 22:44:53 -07:00
Nicolas F c786bd20f8 Fix nuke button prompt (#469)
Hitting the cancel button does not return "", but null. Therefore
the toLowerCase() fails, and throwing an exception means "sure go
ahead submitting this" to JS for some godforsaken reason.

Just remove the toLowerCase for now, have people type the names
properly.
2018-04-04 23:32:04 +02:00
Nicolas F 291f859a4f Use Flask-Assets to minify self-hosted JS files (#468)
* Use Flask-Assets to minify self-hosted JS files

By having Flask-Assets minify the two JS files we ship, namely
main.js and bootstrap-select.js, we can shave off 28406 bytes.

The minified files are generated on startup. If one wishes to
manually clean them up or build them, they can use the
"flask assets" management command, e.g. "flask assets clean".

* Workaround to fix tests

State carries over in tests, which is the dumbest shit ever. Fix it
by clearing the bundles before setting them.
2018-04-04 16:02:05 +02:00
Arylide 03094b6d36 Commit editing time 2018-04-02 13:18:39 -07:00
nyaadev f1bab93a94 fix two bugs and a minor issue 2018-04-02 22:06:41 +02:00
Nicolas F 60ce4ec3f1 Implement comment locking (#439)
* Implement comment locking

This adds a new flag to torrents, which is only editable by
moderators and admins. If checked, it does not allow unprivileged
users to post, edit or delete comments on that torrent.

* Rename "locked" to "comment_locked".

* Shorter button and additional words on alt text

* Admin log: Change comment locking message

dude I love bikeshedding xd

* Bikeshedding over admin log messages

* >&
Also some bikeshedding
2018-03-25 17:03:49 -07:00
Nicolas F 2b5f9922e9 Add rel="prev"/"next" attribs on pagination (#462) 2018-03-25 16:32:03 -07:00
Nicolas F ad5ea6d91e Use rel attributes on links in the info field (#463) 2018-03-25 16:30:57 -07:00
Nicolas F e9b1f6a6c4 Add some rel attributes to links inside markdown (#461)
I currently don't differentiate between "trusted" markdown and
untrusted, but this should be good enough. Basically tells the
browser not to send a referrer, and (not sure if relevant here)
not to expose a window opener object. Also tells search engines
that the link is not endorsed with "nofollow".
2018-03-25 16:30:07 -07:00
Nicolas F dd4510f371 Fix user page ban list styling for multiple bans (#460)
Change each ban to be a bullet point in an unordered list.
2018-03-25 16:27:28 -07:00
Nicolas F c405f49eb6 Redo nuke functionality (#459)
This started out as a simple rebase, but then I rebased the wrong
branches and it all got confusing, so here it is as a new dank
commit.

We now have an @admin_only decorator, and we ask for confirmation
before we nuke. We can also see the nuke button when users are
banned, and nuking is a separate endpoint with a separate form.

Additionally, it now uses the new tracker API.
2018-03-25 16:24:44 -07:00
TheAMM 81806d7bc9 Pad info_hash in ElasticSearch sync scripts
python-mysql-replication (or PyMySQL) would return less than 20 bytes
for info-hashes that had null bytes near the end, leaving incomplete
hashes in the ES index. Without delving too deep into the real issue
(be it lack of understanding MySQL storing binary data or a bug in
the libraries), thankfully we can just pad the fixed-size info-hashes
to be 20 bytes.

Padding in import_to_es.py may be erring on the side of caution, but
safe is established to be better than sorry.

(SQLAlchemy is unaffected by this bug)

Fixes #456
2018-02-25 15:12:35 +02:00
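
The padding itself is a one-liner in Python. The sketch below assumes the missing bytes are always trailing null bytes, as the message describes; the helper name is made up for the example.

```python
INFO_HASH_LENGTH = 20  # a BitTorrent info hash is a 20-byte SHA-1 digest


def pad_info_hash(raw):
    """Right-pad a truncated info hash with null bytes to 20 bytes.

    Assumes the only bytes lost were trailing nulls.
    """
    return raw.ljust(INFO_HASH_LENGTH, b"\x00")
```

A full 20-byte hash passes through unchanged, so applying the helper everywhere is harmless.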
Arylide 0b98b2454a New help section for IRC and some prod changes I never put in the repo because lazy. 2018-02-22 23:23:53 -08:00
nyaadev 8de2663fc2 Remove deprecated torrent delete code. 2018-02-16 19:58:31 +01:00
A nyaa developer d7b413e4d7 site-specific changes for new tracker (#453) 2018-02-12 15:52:35 -08:00
Nicolas F 7bef642f4e Don't submit reports for already banned torrents (#448)
If users kept their page open for a while before reporting a
torrent, and mods got it in the meantime, users could still
submit reports for that torrent. This is silly and really doesn't
need to happen.
2018-02-08 12:12:54 -08:00
nyaadev 658eefe42a fix uncommon exception in report system
fix html style issue in admin box on user page
2018-02-06 23:05:37 +01:00
Anna-Maria Meriniemi e5fe63156d Fix flat PR (#446)
* Clean up PR #349 

- Rely on os.makedirs(..., exist_ok=True) for "thread"-safety

- Remove the previous info_dict when we know the transaction went through.

- bytes.hex() will always be lowercase (unless we go off CPython):
c3d9508ff2/Python/pystrhex.c (L5-L49)
c3d9508ff2/Python/codecs.c (L16)

- Reintroduce comments and meaningful creation dates in generated torrents:
Also make create_default_metadata_base set the correct metadata now
2018-02-04 13:56:29 +01:00
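
Two of the points above, illustrated in Python. The file layout (one file per torrent, named by the hex info hash) and the function name are hypothetical and not taken from the repository.

```python
import os


def store_info_dict(base_dir, info_hash, bencoded):
    """Write a bencoded info dict to base_dir, named by its hex info hash."""
    # exist_ok=True means a concurrent or repeated call does not raise
    # if the directory is already there.
    os.makedirs(base_dir, exist_ok=True)
    # bytes.hex() produces lowercase hex digits in CPython.
    path = os.path.join(base_dir, info_hash.hex())
    with open(path, "wb") as f:
        f.write(bencoded)
    return path
```

Calling it twice with the same hash simply rewrites the same file, and the second os.makedirs call is a no-op.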
nyaadev f38d7e0707 remove broken offset option from infodict_mysql2file.py
could've probably fixed it with ORDER BY but lazy
2018-02-03 21:05:24 +01:00
A nyaa developer e7f412eb8f Merge pull request #349 from nyaadevs/remove_info_mysql
Move bencoded info dicts from mysql torrent_info table to info_dict directory.
2018-02-03 20:22:03 +01:00
nyaadev f2411db485 fix migration 2018-02-02 20:53:46 +01:00
nyaadev d151cca4ef fix last commit 2018-02-02 20:39:02 +01:00
TheAMM a92d886b5c Name fixes, DRY 2018-02-02 20:39:02 +01:00
nyaadev fd0a02b95c Move bencoded info dicts from mysql torrent_info table to info_dict directory. DB change!
IMPORTANT!!! Make sure to run utils/infodict_mysql2file.py before upgrading the database.
2018-02-02 20:39:02 +01:00
sfan5 418856a4bf Undo responsive table for reports (#444)
It was removed on purpose in fdb041c23b.
Instead just add a CSS rule to fix the table header.
2018-02-01 11:35:04 -08:00
sfan5 0fac1c820d Update dark theme (#441)
* Update dark theme CSS

* Use responsive table on Admin > Reports page

Fixes dark theme styling of the table header.
2018-02-01 10:50:31 -08:00
Nicolas F 0285c12264 commenting: show CAPTCHA to new accounts (#443)
Basically re-use the upload CAPTCHA code to also do this for
comments.
2018-02-01 10:50:00 -08:00
Anna-Maria Meriniemi f8a287caa0 Improve and tidy up email blacklist regexes (Hotmail) (#438)
Because reading warnings is overrated.
This does not fix people using custom domains, but it's more likely
they'll know what's up when their email is thrown into the void.

Fixes #437.
2018-01-27 01:55:35 +02:00
Leo Izen f0bd96fe8d static: losslessly optimise PNG images even more (#432)
Used TruePNG and zopflipng to optimise the images even more,
saving a whopping 4073 bytes.

The optimisation is lossless, i.e. the decoded pixel values do not
change at all.
2018-01-16 21:34:43 -08:00
TheAMM d4fcd36b1b Fix mods' comment list filter 2018-01-04 00:34:17 +02:00
Nicolas F 362b3e3dfa user: clean up admin_form HTML (#400)
Properly scaffold it, remove a superfluous form-group div, and style
the select properly.
2018-01-03 14:28:39 -08:00
Nicolas F d00f3686f7 static: losslessly optimise PNG images (#427)
Used zopflipng to optimise the images, saving a whopping 8238 bytes.

The optimisation is lossless, i.e. the decoded pixel values do not
change at all.
2017-12-22 06:44:39 -08:00
Anna-Maria Meriniemi 3941a0b9b3 Quick and dirty comment list for moderators to look at (#421) 2017-12-04 15:51:31 +02:00
JodanJodan 052a038763 Add tabindexes to login elements (#420)
Fixes issue with password managers (e.g. KeePass) tabbing to 'Forgot your password?' link instead of password field.
2017-12-02 13:31:51 +02:00
Anna-Maria Meriniemi 7f9dc622b1 Email blacklist (#419) 2017-11-22 17:19:47 -08:00
Anna-Maria Meriniemi 1f31427e5e Alert hotmail users of the void on register page (#418)
* Alert hotmail users of the void on register page

* Words
2017-11-22 16:40:23 -08:00
Anna-Maria Meriniemi 5990cf2f50 Remove tracker limit and always add our trackers (#417)
With all trackers.txt trackers being included in generated .torrents,
we can now be certain that magnet users (magnets use trackers.txt) and
.torrent users will not be split into different swarms in case the main
announce dies. (Previously, if uploaders added enough of their own trackers
that additional trackers were deemed unnecessary (at least 5 already), the
magnet and .torrent would only share the main site announce.)
2017-11-22 00:02:22 -08:00
TheAMM 4cdf7f4ab3 Support searching for base32 info hash (BTIH)
"BitTorrent info hashes" are generally found in magnet uris.
An info hash is 40 characters in hex and 32 in base32 so the searches won't clash.
2017-11-14 21:27:15 +02:00
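
Since the two encodings differ in length, a search term can be classified by length alone. A minimal sketch, with a hypothetical function name and no claim about how the site's search code is structured:

```python
import base64


def parse_info_hash(term):
    """Return the 20-byte info hash encoded by term, or None if it is not one.

    40 characters are treated as hex, 32 characters as base32.
    """
    term = term.strip()
    try:
        if len(term) == 40:
            return bytes.fromhex(term)
        if len(term) == 32:
            # b32decode expects uppercase unless casefold is set
            return base64.b32decode(term.upper())
    except ValueError:
        # Raised for invalid hex and (as binascii.Error) for invalid base32
        pass
    return None
```

Both forms decode to the same 20 raw bytes, so the lookup after parsing does not need to care which encoding the user pasted.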
TheAMM 630c69727d Torrent validation: explicitly mention missing announce-key 2017-11-13 23:01:58 +02:00