Overhaul README.md with more helpful and more up-to-date instructions

2025-04-04 19:09:27 +00:00 · 2017-05-28 02:16:25 +03:00 · 2017-05-28 02:16:25 +03:00 · 926c85058e
parent 05713892f7
commit 926c85058e
1 changed files with 87 additions and 59 deletions
--- a/README.md
+++ b/README.md
@@ -1,82 +1,110 @@
 # NyaaV2

-## Setup
+## Setting up for development
+This project uses Python 3.6. It relies on features that do not exist in 3.5, so make sure to use Python 3.6.
+This guide also assumes you 1) are using Linux and 2) are reasonably comfortable with the command line.
+It's not impossible to run Nyaa on Windows, but this guide doesn't focus on that.

+### Code Quality:
+- Before we get any deeper, remember to follow PEP8 style guidelines and run `./lint.sh` before committing.
+    - You may also use `pycodestyle nyaa/ --show-source --max-line-length=100` to see a list of warnings/problems instead of having `lint.sh` make modifications for you
+- Other than PEP8, try to keep your code clean and easy to understand, as well. It's only polite!
+
+### Setting up Pyenv
+pyenv eases the use of different Python versions, and as not all Linux distros offer 3.6 packages, it's right up our alley.
 - Install dependencies https://github.com/pyenv/pyenv/wiki/Common-build-problems
 - Install `pyenv` https://github.com/pyenv/pyenv/blob/master/README.md#installation
 - Install `pyenv-virtualenv` https://github.com/pyenv/pyenv-virtualenv/blob/master/README.md
- `pyenv install 3.6.1`
- `pyenv virtualenv 3.6.1 nyaa`
- `pyenv activate nyaa`
+- Install Python 3.6.1 with `pyenv` and create a virtualenv for the project:
+    - `pyenv install 3.6.1`
+    - `pyenv virtualenv 3.6.1 nyaa`
+    - `pyenv activate nyaa`
 - Install dependencies with `pip install -r requirements.txt`
 - Copy `config.example.py` into `config.py`
- Change TABLE_PREFIX to `nyaa_` or `sukebei_` depending on the site
+    - Change `SITE_FLAVOR` in your `config.py` depending on which instance you want to host

-### Setting up MySQL/MariaDB database for advanced functionality
+### Setting up MySQL/MariaDB database
+You *may* use SQLite, but the current support for it in this project is outdated and largely unmaintained.
 - Enable `USE_MYSQL` flag in config.py
 - Install latest mariadb by following instructions here https://downloads.mariadb.org/mariadb/repositories/
    - Tested versions: `mysql  Ver 15.1 Distrib 10.0.30-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2`
- Run the following commands logged in as your root db user:
+- Run the following commands logged in as your root db user (substitute for your own `config.py` values if desired):
    - `CREATE USER 'test'@'localhost' IDENTIFIED BY 'test123';`
-    - `GRANT ALL PRIVILEGES ON * . * TO 'test'@'localhost';`
+    - `GRANT ALL PRIVILEGES ON *.* TO 'test'@'localhost';`
    - `FLUSH PRIVILEGES;`
    - `CREATE DATABASE nyaav2 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;`
- To setup and import nyaa_maria_vx.sql:
-    - `mysql -u <user> -p nyaav2`
-    - `DROP DATABASE nyaav2;`
-    - `CREATE DATABASE nyaav2 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;`
-    - `SOURCE ~/path/to/database/nyaa_maria_vx.sql`

 ### Finishing up
- Run `python db_create.py` to create the database
- Load the .sql file
-    - `mysql -u user -p nyaav2`
-    - `SOURCE cocks.sql`
-    - Remember to change the default user password to an empty string to disable logging in
+- Run `python db_create.py` to create the database and import categories
+    - Follow the advice of `db_create.py` and run `./db_migrate.py stamp head` to mark the database version for Alembic
 - Start the dev server with `python run.py`
- When you are finished developing, deactivate your virtualenv with `source deactivate`
+- When you are finished developing, deactivate your virtualenv with `pyenv deactivate` or `source deactivate` (or just close your shell session)

-## Enabling ElasticSearch
+You're now ready for simple testing and development!
+Continue below to learn about database migrations and enabling the advanced search engine, Elasticsearch.

-### Basics
- Install jdk `sudo apt-get install openjdk-8-jdk`
- Install elasticsearch https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html
- `sudo systemctl enable elasticsearch.service`
- `sudo systemctl start elasticsearch.service`
- Run `curl -XGET 'localhost:9200'` and make sure ES is running
- Optional: install Kabana as a search frontend for ES
-
-### Enable MySQL Binlogging
- Add the `[mariadb]` bin-log section to my.cnf and reload mysql server
- Connect to mysql
- `SHOW VARIABLES LIKE 'binlog_format';`
-    - Make sure it shows ROW
- Connect to root user
- `GRANT REPLICATION SLAVE ON *.* TO 'test'@'localhost';` where test is the user you will be running `sync_es.py` with
-
-### Setting up ES
- Run `./create_es.sh` and this creates two indicies: `nyaa` and `sukebei`
- The output should show `acknowledged: true` twice
- The safest bet is to disable the webapp here to ensure there's no database writes
- Run `python import_to_es.py` with `SITE_FLAVOR` set to `nyaa`
- Run `python import_to_es.py` with `SITE_FLAVOR` set to `sukebei`
- These will take some time to run as it's indexing
-
-### Setting up sync_es.py
- Sync_es.py keeps the ElasticSearch index updated by reading the BinLog
- Configure the MySQL options with the user where you granted the REPLICATION permissions
- Connect to MySQL, run `SHOW MASTER STATUS;`.
- Copy the output to `/var/lib/sync_es_position.json` with the contents `{"log_file": "FILE", "log_pos": POSITION}` and replace FILENAME with File (something like master1-bin.000002) in the SQL output and POSITION (something like 892528513) with Position
- Set up `sync_es.py` as a service and run it, preferably as the system/root
- Make sure `sync_es.py` runs within venv with the right dependencies
-
-Enable the `USE_ELASTIC_SEARCH` flag in `config.py`, restart the application, and you're good to go.

 ## Database migrations
- Uses [flask-Migrate](https://flask-migrate.readthedocs.io/)
- Run `./db_migrate.py db migrate` to generate the migration script after database model changes.
- Take a look at the result in `migrations/versions/...` to make sure nothing went wrong.
- Run `./db_migrate.py db upgrade` to upgrade your database.
+- Database migrations are done with [flask-Migrate](https://flask-migrate.readthedocs.io/), a wrapper around [Alembic](http://alembic.zzzcomputing.com/en/latest/).
+- If someone has made changes in the database schema and included a new migration script:
+    - If your database has never been marked by Alembic (you're on a database from before the migrations), run `./db_migrate.py stamp head` before pulling the new migration script(s).
+        - If you already have the new scripts, check the output of `./db_migrate.py history` instead and choose a hash that matches your current database state, then run `./db_migrate.py stamp <hash>`.
+    - Update your branch (e.g. `git fetch && git rebase origin/master`)
+    - Run `./db_migrate.py upgrade head` to run the migration. Done!
+- If *you* have made a change in the database schema:
+    - Save your changes in `models.py` and ensure the database schema matches the previous version (i.e. your new tables/columns are not added to the live database)
+    - Run `./db_migrate.py migrate -m "Short description of changes"` to automatically generate a migration script for the changes
+      - Check the script (`migrations/versions/...`) and make sure it works! Alembic may not be able to notice all changes.
+    - Run `./db_migrate.py upgrade` to run the migration and verify the upgrade works.
+       - (Run `./db_migrate.py downgrade` to verify the downgrade works as well, then upgrade again)

-## Code Quality:
- Remember to follow PEP8 style guidelines and run `./lint.sh` before committing.
+
+## Setting up and enabling Elasticsearch
+
+### Installing Elasticsearch
+- Install JDK with `sudo apt-get install openjdk-8-jdk`
+- Install [Elasticsearch](https://www.elastic.co/downloads/elasticsearch)
+    - [From packages...](https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html)
+        - Enable the service:
+            - `sudo systemctl enable elasticsearch.service`
+            - `sudo systemctl start elasticsearch.service`
+    - or [simply extracting the archives and running the files](https://www.elastic.co/guide/en/elasticsearch/reference/current/_installation.html), if you don't feel like permanently installing ES
+- Run `curl -XGET 'localhost:9200'` and make sure ES is running
+    - Optional: install [Kibana](https://www.elastic.co/products/kibana) as a search debug frontend for ES
+
+### Setting up ES
+- Run `./create_es.sh` to create the indices for the torrents: `nyaa` and `sukebei`
+    - The output should show `acknowledged: true` twice
+- Stop the Nyaa app if you haven't already
+- Run `python import_to_es.py` to import all the torrents (on nyaa and sukebei) into the ES indices.
+    - This may take some time to run if you have plenty of torrents in your database.
+
+Enable the `USE_ELASTIC_SEARCH` flag in `config.py` and (re)start the application.
+Elasticsearch should now be functional! The ES indices won't be updated "live" with the current setup; continue below for instructions on how to hook Elasticsearch up to the MySQL binlog.
+
+However, take note that the binlog is not necessary for simple ES testing and development; you can simply run `import_to_es.py` from time to time to reindex all the torrents.
+
+### Enabling MySQL Binlogging
+- Edit your MariaDB/MySQL server configuration and add the following under `[mariadb]`:
+    ```
+    log-bin
+    server_id=1
+    log-basename=master1
+    binlog-format=row
+    ```
+- Restart MariaDB/MySQL (`sudo service mysql restart`)
+- Copy the example configuration (`es_sync_config.example.json`) as `es_sync_config.json` and adjust options in it to your liking (verify the connection options!)
+- Connect to mysql as root
+    - Verify that the result of `SHOW VARIABLES LIKE 'binlog_format';` is `ROW`
+    - Execute `GRANT REPLICATION SLAVE ON *.* TO 'username'@'localhost';` to allow your configured user access to the binlog
+
+
+### Setting up sync_es.py
+`sync_es.py` keeps the Elasticsearch indices updated by reading the binlog and pushing the changes to the ES indices.
+- Make sure `es_sync_config.json` is configured with the user you granted the `REPLICATION` permissions to
+- Run `import_to_es.py` and copy the outputted JSON into the file specified by `save_loc` in your `es_sync_config.json`
+- Run `sync_es.py` as-is *or*, for actual deployment, set it up as a service and run it, preferably as the system/root user
+    - Make sure `sync_es.py` runs within the venv with the right dependencies!
+
+You're done! The script should now be feeding updates from the database to Elasticsearch.
+Take note, however, that the specified ES index refresh interval is 30 seconds, which may feel like a long time on local development. Feel free to adjust it or [poke Elasticsearch yourself!](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html)
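
As a reviewer-side aid, the `[mariadb]` binlog settings added under "Enabling MySQL Binlogging" above could be sanity-checked with a short script before restarting the server. This is only a minimal sketch and not part of the commit: the helper name `check_binlog_config` is hypothetical, it parses a config snippet passed in as a string rather than a full `my.cnf` (real files may contain `!includedir` directives that `configparser` does not understand), and it only recognizes the option spellings used in the README.

```python
import configparser


def check_binlog_config(text):
    """Return a list of problems found in the [mariadb] section of a config snippet.

    Hypothetical helper for illustration; it only knows the option
    spellings shown in the README (log-bin, server_id, binlog-format).
    """
    # allow_no_value lets bare flags such as `log-bin` parse without "=value";
    # interpolation=None keeps any "%" in values from being interpreted.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(text)

    if not parser.has_section("mariadb"):
        return ["missing [mariadb] section"]

    section = parser["mariadb"]
    problems = []
    if "log-bin" not in section:
        problems.append("log-bin is not enabled")
    if (section.get("binlog-format") or "").lower() != "row":
        problems.append("binlog-format is not row")
    if not section.get("server_id"):
        problems.append("server_id is not set")
    return problems


# The snippet from the README, without the list indentation.
EXAMPLE = """\
[mariadb]
log-bin
server_id=1
log-basename=master1
binlog-format=row
"""

if __name__ == "__main__":
    print(check_binlog_config(EXAMPLE))  # prints []
```

An empty list means the snippet contains the settings the README asks for; the `SHOW VARIABLES LIKE 'binlog_format';` check from the README is still the authoritative test, since it reflects what the running server actually loaded.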