API documentation; TODO list; raise 400 for internal exceptions.

Stefano Cossu 1 year ago
parent
commit
ba86a594a7
4 changed files with 138 additions and 10 deletions
  1. README.md (+83 -5)
  2. TODO.md (+46 -0)
  3. transliterator/rest_api.py (+8 -5)
  4. transliterator/tables/data/_cyrillic_base.yml (+1 -0)

+ 83 - 5
README.md

@@ -16,10 +16,88 @@ Start container:
 docker run -e TXL_FLASK_SECRET=changeme -p 8000:8000 transliterator:latest
 ```
 
-Test service:
+## Web UI
 
-```
-curl localhost:8000/health
-```
+`/` renders a simple HTML form to test the transliteration service.
+
+
+## REST API
+
+### `GET /health`
+
+Useful endpoint for health checks.
+
+#### Response code
+
+`200 OK` if the service is running.
+
+### `GET /languages`
+
+List all the languages supported.
+
+#### Response code
+
+`200 OK`
+
+#### Response body
+
+MIME type: `application/json`
+
+Content: a JSON object of the supported language tables. Keys are the keywords
+used throughout the API, e.g. for `/transliterate`. Each key is paired with an
+object that contains some basic metadata about the language features. At the
+moment, only the human-readable name is available.
+
+### `GET /table/<lang>`
+
+Dump a language table.
+
+#### URI parameters
+
+- `<lang>`: Language code as given by the `/languages` endpoint. 
+
+#### Response code
+
+`200 OK`
+
+#### Response body
+
+MIME type: `application/json`
+
+Content: language configuration as a JSON object with all the transliteration
+rules as they are read by the application. If the table inherits from a parent,
+the computed values from the merged tables are shown.
+
+### `POST /transliterate/<lang>[/r2s]`
+
+Transliterate an input string in a given language.
+
+#### URI parameters
+
+- `<lang>`: Language code as given by the `/languages` endpoint. 
+- `r2s`: if appended to the URI, the transliteration is Roman-to-script and
+  the input string should be Latin text. If omitted, the default behavior
+  applies: the input is interpreted as text in the script of the given
+  language, and the Romanized text is returned.
+
+#### POST body
+
+- `text`: Input text to be transliterated.
+
+#### Response code
+
+- `200 OK` on successful operation.
+- `400 Bad Request` for an invalid request. The reason for the failure is
+  normally printed in the response body.
+
+#### Response body
+
+MIME type: `text/plain`
+
+Content: transliterated string. Characters not found in the mapping are copied
+verbatim (see "Configuration files" section for more information).
+
+
+## Configuration files
 
-TODO: API endpoints are stubs at the moment.
+TODO
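
As a rough client-side illustration of the `POST /transliterate/<lang>[/r2s]` endpoint documented in the hunk above, the request could be assembled with the Python standard library as sketched below. The helper name `build_transliterate_request` is hypothetical, and sending `text` as a form-encoded body is an assumption: the README only states that the POST body carries a `text` field.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_transliterate_request(base_url, lang, text, r2s=False):
    """Build a POST request for /transliterate/<lang>[/r2s] (hypothetical helper)."""
    path = f"/transliterate/{quote(lang, safe='')}"
    if r2s:
        # Appending /r2s requests Roman-to-script instead of the default
        # script-to-Roman direction.
        path += "/r2s"
    # Assumption: the `text` field is sent as a form-encoded POST body.
    body = urlencode({"text": text}).encode("utf-8")
    return Request(base_url.rstrip("/") + path, data=body, method="POST")
```

Passing the returned object to `urllib.request.urlopen` against a running container (for example the one started on port 8000 earlier in the README) would send the request. Per the README, a `200 OK` response carries the transliterated string as `text/plain`, and a `400 Bad Request` normally carries the failure reason in the body.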

+ 46 - 0
TODO.md

@@ -0,0 +1,46 @@
+# Brief TODO list
+
+*P* = pending; *W* = working on it; *D* = done; *B* = blocked (needs
+discussion, etc.); *X* = not implementing.
+
+- *D* Basic table loading & parsing
+- *D* Table inheritance
+- *P* Multiple inheritance (not recursive)
+- *D* Ignore list (R2S)
+- *D* Basic transliteration in both directions
+- *D* Basic REST API
+- *D* Basic UI
+- *P* Life cycle hooks for plugins
+- *P* API documentation
+- *W* Complete conversion of existing tables to YAML
+  - *P* Arabic
+  - *P* Armenian
+  - *P* Azerbaijani
+  - *D* Belarusian
+  - *P* Bulgarian
+  - *D* Chinese
+  - *P* Ethiopic
+  - *P* Georgian
+  - *P* Greek
+  - *P* Hebrew and Yiddish
+  - *X* Japanese
+  - *P* Kazakh
+  - *P* Korean
+  - *P* Kyrgyz
+  - *P* Mongolian
+  - *P* Persian
+  - *P* Pushto
+  - *D* Russian
+  - *P* Serbian
+  - *P* Slavonic
+  - *P* Tajik
+  - *P* Tatar
+  - *P* Thaana
+  - *P* Turkmen
+  - *D* Ukrainian
+  - *P* Urdu
+  - *P* Uzbek
+- *P* Additional languages not in legacy tables, but in other software
+  - *B* Arabic S2R (ArabicTransliterator)
+  - *B* Japanese (?)
+  - *B* Korean (K-romanizer)

+ 8 - 5
transliterator/rest_api.py

@@ -12,8 +12,8 @@ def create_app():
     app.config.update({
         "ENV": flask_env,
         "SECRET_KEY": environ["TXL_FLASK_SECRET"],
-        # Prod requires the application to be behind Nginx, or static files
-        # won't be served directly by Flask using this option.
+        # Prod requires the application to be behind a web server that supports
+        # X-Sendfile; with this option Flask does not serve files directly.
         "USE_X_SENDFILE": flask_env == "production",
         "JSON_AS_ASCII": False,
         "JSONIFY_PRETTYPRINT_REGULAR": True,
@@ -62,9 +62,12 @@ def transliterate_req(lang, r2s=False):
     if not len(in_txt):
         return ("No input text provided! ", 400)
 
-    rsp = Response(
-            transliterate(in_txt, lang, r2s),
-            mimetype="text/plain")
+    try:
+        out = transliterate(in_txt, lang, r2s)
+    except (NotImplementedError, ValueError) as e:
+        return (str(e), 400)
+
+    rsp = Response(out, mimetype="text/plain")
     rsp.headers["Content-Type"] = "text/plain; charset=utf-8"
 
     return rsp

+ 1 - 0
transliterator/tables/data/_cyrillic_base.yml

@@ -20,6 +20,7 @@ roman_to_script:
     # dedicated U+2160÷U+216F (uppercase Roman
     # numerals) and/or U+2170÷U+217F (lower case Roman
     # numerals) ranges to avoid this ambiguity.
+    # TODO implement regular expressions for ignore patterns.
     #- re: "I{2,3}"
     #- re: "I(V|X)"
     #- re: "LI{,3}"
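
One possible shape for the regular-expression ignore patterns mentioned in the new TODO is sketched below. It assumes the patterns would be compiled with Python's `re` module and tried at the current input position, keeping the longest match. The function name `ignored_span_end` and the longest-match policy are illustrative assumptions, not the project's design.

```python
import re

# Patterns copied from the commented-out `re:` entries in the table.
IGNORE_PATTERNS = [re.compile(p) for p in (r"I{2,3}", r"I(V|X)", r"LI{,3}")]


def ignored_span_end(text, pos=0):
    """Return the end index of the longest ignore match starting at pos, or None."""
    # Pattern.match(text, pos) only matches starting exactly at `pos`.
    ends = [m.end() for p in IGNORE_PATTERNS if (m := p.match(text, pos))]
    return max(ends) if ends else None
```

A caller could copy `text[pos:end]` verbatim and resume transliteration at `end`. Note that `LI{,3}` also matches a bare `L`, since `{,3}` allows zero repetitions.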