@@ -43,9 +43,14 @@ such as `publisher not identified=publisher not identified`.

Q: Shall spaces around the `=` sign be ignored?

+A (RB): Very likely yes, spaces are represented in the legacy files by
+underscores.
+
Q: What are the `_` at the end of some mappings, e.g. `U+4E00=yi_` for Chinese?
Are they supposed to add a space where the underscore appears?

+A (RB): Yes.
+
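The two answers above (ignore spaces around `=`, read `_` as a space) could be sketched roughly as follows. This is a minimal illustration, not the legacy parser: the function name `parse_mapping_line` is hypothetical, and treating a `U+XXXX` key as a single code point and leaving underscores in keys untouched are assumptions.

```python
def parse_mapping_line(line):
    """Split a legacy `key=value` mapping line into a (source, target) pair."""
    # Split on the first `=` only, so the value may itself contain `=`.
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"not a mapping line: {line!r}")
    # Ignore whitespace around `=`; spaces inside the key or value are kept.
    key = key.strip()
    # Underscores in the value stand for spaces (per the answers above).
    value = value.strip().replace("_", " ")
    # Assumption: a `U+XXXX` key names a single Unicode code point.
    if key.upper().startswith("U+"):
        key = chr(int(key[2:], 16))
    return key, value
```

With this sketch, `U+4E00=yi_` and `U+4E00 = yi_` both map the character U+4E00 to `yi` followed by one space.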
## `ReRomanizeRecord.bas`

Much of the code deals with MARC records. No need to be concerned about that since
@@ -55,10 +60,15 @@ Q: Is it possible (and desirable) to determine the S2R/R2S direction from user
prompt rather than guessing it from the text as the legacy software seems to
be doing?

+A (RB): Yes.
+
Q: The software seems to take multi-line directives in the configuration into
account. Is it possible to avoid these for simplicity, or is there a need to
express some mapping in multiple lines?

+A (RB): Long lines may be needed. (SC) This would be moot in YAML that supports
+multi-line strings via folding.
+
Detailed breakdown of individual functions follows.


@@ -197,11 +207,15 @@ This is the logic of the romanization process by character or syllable.

#### `EvaluateFirstCharacter`

-This determines if the translation is R2V or V2R. Does this work reliably and
+This determines if the translation is R2S or S2R. Does this work reliably and
independently of any external directive? Could there be some strings in foreign
scripts that start with Latin characters (e.g. numbers or Western terms), and
lead to unexpected results?

+A (RB): Some non-Latin scripts may start with Latin characters. (SC) Let's use
+an explicit direction option from the user. The logic would be too complicated
+and flimsy.
+
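To illustrate why guessing from the first character is fragile, here is a small sketch contrasting a naive guess with an explicit option. Everything in it is hypothetical: `guess_direction` is a stand-in and not the legacy `EvaluateFirstCharacter` logic, and reading S2R as "script to Roman" and R2S as "Roman to script" is an assumption.

```python
from enum import Enum


class Direction(Enum):
    S2R = "s2r"  # assumed meaning: script to Roman
    R2S = "r2s"  # assumed meaning: Roman to script


def guess_direction(text):
    """Naive first-character guess (hypothetical stand-in, not the legacy code)."""
    first = text.lstrip()[:1]
    # ASCII first character -> assume Roman input; anything else -> script input.
    return Direction.R2S if first.isascii() else Direction.S2R


def resolve_direction(direction):
    """Require an explicit direction from the user instead of guessing."""
    try:
        return Direction(direction.lower())
    except ValueError:
        raise ValueError("direction must be 's2r' or 'r2s'") from None
```

A Chinese string that begins with a year in ASCII digits is classified as Roman input by the naive guess, which is the kind of surprise an explicit `resolve_direction("s2r")` avoids.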
(Also the translation is supposedly purpose-driven, as the user should have a
specific direction in mind and wouldn't want the software to decide for them.)

@@ -210,6 +224,9 @@ specific direction in mind and wouldn't want the software to decide for them.)

Replace apostrophe characters with glyphs supported by foreign script?

+Comment (RB): More clarity is needed around Latin and non-Latin punctuation to
+be used. (SC): TODO More to be discussed via conf call.
+

#### Field- and UI-related functions

@@ -233,6 +250,11 @@ MARC markers. Do we need to deal with these manually as indicators related to
the script/language handled, or shall we expect any text string input in the
new Transliterator to be clean from MARC flags?

+A (RB): Most MARC markers are obsolete; but there may be other markers that
+are not easily transliterated, e.g. BIBFRAME markers. More discussion is needed
+on this point. (SC) Need feedback from KF + MM about what to expect from input
+and output string in this regard.
+
#### `RomanizeConvertDecimalChars`

Convert escape sequences `&#\d{4,5}` to code points.
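A rough Python equivalent of this conversion might look like the sketch below. The function name `convert_decimal_chars` is hypothetical, and accepting an optional trailing `;` is an assumption, since the pattern in the note does not include one.

```python
import re

# `&#` followed by 4 or 5 decimal digits, as in the pattern above.
# Assumption: a trailing `;` is consumed if present but is not required.
DECIMAL_ESCAPE = re.compile(r"&#(\d{4,5});?")


def convert_decimal_chars(text):
    """Replace decimal escape sequences with the characters they encode."""
    return DECIMAL_ESCAPE.sub(lambda m: chr(int(m.group(1))), text)
```

Escapes with fewer than four digits (for example `&#65;`) are left untouched, which follows the `{4,5}` quantifier literally; whether the legacy code behaves the same way for such input is not stated here.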