serialization for selections from a large off-line Japanese text resource
serialization for selections from a large off-line Japanese text resource - RobertShiplett - 06-04-2012

I am making a separate note here because my next step is to let a large text resource such as JMDict2 stand off-line while async worker requests bring in serialized segments, and that means Curl in xcurl scripts running off-site. It is a no-SQL solution that appeals to me.

A single flat file might appear to be the answer (and I am an old fan of single-file Smalltalk bytecode images), but JMDict is too full of the obscure and the technical to stand as one file for an on-line no-SQL Japanese-English dictionary.

My efforts at serialization to date can be seen at
http://www.aule-browser.com/kanji/poets/basho-complete-indexed.html
http://www.aule-browser.com/kanji/poets/basho-ame-indexed.html
http://www.aule-browser.com/kanji/poets/basho-mee-indexed.html

All definitions in English for Kanji in kanjidic2 · 13MB of XML reduced to 500k as a serialized Curl array of Definition objects:
http://www.aule-browser.com/kanji/kanjidic2-all-indexed.html
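
To make the idea of serialized segments concrete, here is a minimal sketch in Python rather than Curl, so it does not show the Curl serialization API or the code behind the pages linked above. Everything in it is an assumption for illustration: the `Definition` fields, the bucketing by the first character's code point, the bucket count of 64, and JSON plus zlib as the serialized form. The segments are held in an in-memory dict here. In the off-line arrangement described above, each one would presumably be a separate file that a worker request fetches on demand.

```python
import json
import zlib
from dataclasses import dataclass, asdict


@dataclass
class Definition:
    # Hypothetical stand-in for the Definition objects mentioned in the post.
    kanji: str
    meanings: list


def segment_key(kanji: str, buckets: int = 64) -> int:
    # Assumed scheme: bucket entries by the code point of their first character.
    return ord(kanji[0]) % buckets


def build_segments(definitions, buckets: int = 64) -> dict:
    """Group definitions into buckets and serialize each bucket to compressed bytes."""
    grouped = {}
    for d in definitions:
        grouped.setdefault(segment_key(d.kanji, buckets), []).append(asdict(d))
    return {
        key: zlib.compress(json.dumps(items, ensure_ascii=False).encode("utf-8"))
        for key, items in grouped.items()
    }


def load_segment(blob: bytes) -> list:
    """Deserialize one segment back into Definition objects."""
    items = json.loads(zlib.decompress(blob).decode("utf-8"))
    return [Definition(**item) for item in items]


def lookup(segments: dict, kanji: str, buckets: int = 64):
    """Load only the segment that could contain `kanji`, then scan it."""
    blob = segments.get(segment_key(kanji, buckets))
    if blob is None:
        return None
    for d in load_segment(blob):
        if d.kanji == kanji:
            return d
    return None
```

A lookup touches one small segment instead of the whole resource, which is the property that makes a single flat file unnecessary. The bucket count trades the number of segment files against the size of each one.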