Curl Global Community
serialization for selections from a large off-line Japanese text resource


serialization for selections from a large off-line Japanese text resource - RobertShiplett - 06-04-2012

I am making a separate note here because my next step is to let a large text resource such as JMDict2 stand off-line while async worker requests bring in serialized segments, and that means Curl in xcurl scripts running off-site.

It is a no-SQL solution that appeals to me. A single flat file might appear to be the answer (and I am an old fan of single-file Smalltalk bytecode images), but JMDict is too full of the obscure and the technical to stand as one file for an on-line no-SQL Japanese-English dictionary.
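
To make the segmented idea above concrete, here is a minimal sketch in Python rather than Curl. Everything in it is an assumption for illustration: the choice of segment key (first character of the headword), JSON as the serialized form, and the names `build_segments` and `SegmentStore`. It is also synchronous, whereas the plan described above uses async worker requests.

```python
import json
from collections import defaultdict


def build_segments(entries):
    """Group (headword, gloss) pairs by first character and serialize each group."""
    groups = defaultdict(dict)
    for headword, gloss in entries:
        groups[headword[0]][headword] = gloss
    # One serialized blob per segment; in practice each could be a separate file.
    return {key: json.dumps(group, ensure_ascii=False) for key, group in groups.items()}


class SegmentStore:
    """Hypothetical store that deserializes a segment only on first use."""

    def __init__(self, segments):
        self._segments = segments  # serialized blobs, standing in for off-line files
        self._loaded = {}          # segments deserialized so far

    def lookup(self, headword):
        if not headword:
            return None
        key = headword[0]
        if key not in self._segments:
            return None
        if key not in self._loaded:
            self._loaded[key] = json.loads(self._segments[key])
        return self._loaded[key].get(headword)

    def loaded_keys(self):
        return sorted(self._loaded)


store = SegmentStore(build_segments([("雨", "rain"), ("雨雲", "rain cloud"), ("月", "moon")]))
```

Looking up a word touches only the segment that could contain it, so the rest of the resource can stay off-line until it is asked for.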

My efforts at serialization to date can be seen at

http://www.aule-browser.com/kanji/poets/basho-complete-indexed.html

http://www.aule-browser.com/kanji/poets/basho-ame-indexed.html

http://www.aule-browser.com/kanji/poets/basho-mee-indexed.html

All definitions in English for Kanji in kanjidic2: 13MB of XML reduced to 500k as a serialized Curl array of Definition objects.

http://www.aule-browser.com/kanji/kanjidic2-all-indexed.html
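
The reduction from XML to a compact serialized array can be sketched as follows, again in Python rather than Curl. The sample markup is an assumption based on the usual kanjidic2 layout, in which English `meaning` elements carry no `m_lang` attribute. The function name `english_definitions` and the use of JSON lists in place of Curl Definition objects are likewise illustrative only.

```python
import json
import xml.etree.ElementTree as ET

# Tiny stand-in for the real kanjidic2 file (structure assumed, not copied from it).
SAMPLE = """<kanjidic2>
  <character>
    <literal>雨</literal>
    <reading_meaning>
      <rmgroup>
        <reading r_type="ja_on">ウ</reading>
        <meaning>rain</meaning>
        <meaning m_lang="fr">pluie</meaning>
      </rmgroup>
    </reading_meaning>
  </character>
</kanjidic2>"""


def english_definitions(xml_text):
    """Return [literal, [english meanings]] pairs, dropping everything else."""
    root = ET.fromstring(xml_text)
    result = []
    for character in root.iter("character"):
        literal = character.findtext("literal")
        # Keep only meanings without a language attribute (assumed to be English).
        meanings = [m.text for m in character.iter("meaning") if "m_lang" not in m.attrib]
        result.append([literal, meanings])
    return result


definitions = english_definitions(SAMPLE)
# Compact separators keep the serialized form as small as plain JSON allows.
serialized = json.dumps(definitions, ensure_ascii=False, separators=(",", ":"))
```

Most of the saving comes from discarding the readings, codepoints and non-English meanings, not from the serialization format itself.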