a949408 (HEAD -> master, origin/master, origin/HEAD) ICU is namespaced in newer versions (fixes #42) e272cd2 Change correction finding order (issue #28); Require C++1y (C++14) 0521ea9 (tag: v0.5.0) Copyright extends to 2018. e16dcbb Ready for release 0.5.0. 678d45d Treat all tests as XFAIL_TESTS if zhfst support is disabled. a3f39d9 Merge pull request #31 from hfst/configure-explicit-error bc1c9ea Separate counters (fixes #37, closes #38) a4aeae0 Post-case-mangling unique test 1d088e1 Avoid shadowing multicharacter ascii-beginning symbols 69ab95d Merge pull request #35 from unhammer/renamespace-hfst_ol d39b93c rename hfst_ol namespace to hfst_ospell to avoid conflicts eccbd6f Bump actual version; Modernize for() loops a9919f4 Merge pull request #33 from unhammer/analysisSymbols f36de02 bump minor version because of new API function 019f303 New API fn analyseSymbols giving unconcatenated vector<string>'s da508b7 Dirty hack to parse XML a bit better; ToDo: use actual portable XML library dad7c5c Minimal XML parsing to get locale, title, and description for Voikko and other frontends f40d4b5 (origin/configure-explicit-error) let './configure --enable-zhfst' fail if no libarchive found a737b52 When using help2man remember -n description 1fc3fca Update copyright and authors listing d791a41 (tag: v0.4.5) Release 0.4.5. 4f2026c Add missing files to dist. bae21c3 Forgot to commit configure.ac... Update hfst-ospell manual and add man pages to dist. 7c8690a (tag: v0.4.4) Ready for release 0.4.4. 671efe6 Trusty has C++11 32842b1 Disable nightly repo 2a4e556 Move more files 0ab57c9 Remove HFST dependency - fixes #26; Move tests to folder 687b3c1 The --with-extract arg now sets the default - the other option is still tried if default fails; Oddly enough, this fixes issue #26 for no good reason - it's happily extracting to memory now, when it wasn't before. f061774 Apply the notimestamp patch in trunk 3322fd2 Extra early checks for archive extraction failures 0cc1a4a Use std::runtime_exception 14b1b2c Rename stdafx.h and install it 4569e9f Windows DLL fixes 88aa8dd Allow building without XML support 139790f (tag: v0.4.3) Ready for release 0.4.3. 9562377 Fixes for big endian conversions; Make use of AC_C_BIGENDIAN and WORDS_BIGENDIAN; UCHAR_MAX should be +1 to allow hitting index 255 bf79d25 Add returns 8573403 Merge branch 'big_endian_compatibility_suggestion' In hopes that this approach to fixing build problems on the problematic architectures will get tested for Debian builds. This is experimental, but at least shouldn't break anything - feel free to revert if unhelpful. b756b2e Killed an error. Don't know whether it was the correct solution - please review. 3ca7756 max version for tinyxml 662f79c (origin/big_endian_compatibility_suggestion) Forgot to actually assign header length to a variable when reading raw bytes abed65f A suggestion for big-endian compatibility 43f5dc6 (tag: v0.4.2) Release 0.4.2. e9cc1da Prevent segfaults from no-errormodel.sh test This also makes the test succeed rather than XLFAIL, so removed it from XFAIL_TESTS (why not?) 6d359c0 Remove unused $(HFST_FLAGS) variable that may be leading to some test failures b31fec1 markdown 9fa49a2 not supporting default autostuff like promised... c53090d Merge branch 'master' of github.com:hfst/hfst-ospell 4b07bf5 experimental ci f582bac (tag: v0.4.1) Release 0.4.1. 29649b1 Set gitignores. 31e6842 Bump patch version since revision has decreased d2fb71e set_time_cutoff(6.0) 08bc7a9 When upping to 16 variants, also make room for all 16 variants 080042a Added section header, correct tool name to man page. a95e4aa 0.4.0 2565275 Release 0.4.0. 3e31ef1 Update help message, copyright extends to 2016. 6ca58bc Make no-errormodel.sh an xfail test. 047e9d9 Only call clock() every millionth time for speed ae8a660 Fix time cutoff handling b222053 Update man pages 96e17f7 Update man page d92e8a7 If the input token had any upper case, also try a first-upper variant of it 5fee947 Special case nuvviDspeller b5946d9 Bugfix: class -> struct (libvoikko complained under OSX, and err’ed out). 929f926 If we are looking for suggestions, don't use the cache 04e4843 Fix a couple of typos in argument list and verbose prints. Add a missing statement that actually sets the time cutoff for the speller. cf8ee5c Fix a typo in wide string conversion function. 30940be Time cutoff option during correction for library and demo app 77bed9b Apply Brendan Molloy's alignment patch for ARM CPUs & reindent to namespace=0 da6fcc5 Propagate first-upper and all-upper to suggestions b57d248 Add option --enable-hfst-ospell-office to configure (defaults to yes). Add support for using libarchive2 if libarchive3 is not available. 26a4e37 Add --verbatim to disable the variations ece3455 Don't use UTF-8 length for UTF-16 buffer 3b07200 I meant 20480 df80db9 Use ICU to determine what's alphanumeric - should fix https://github.com/TinoDidriksen/spellers/issues/1 bc4b3a2 If pkg-config fails, fall back on icu-config - blame Ubuntu 12.04 3266e7b Silly mistake 5226fb2 Use ICU instead be5f73a Use wide chars during case folding; Set locale 90d7fcb Fix typo that broke UNKNOWN handling the the error source 0ff3b88 Fix bug in novel symbol tokenization 791ca3e Bounds check table accesses Now that new symbols from input can be added to the lexicon's alphabet, it's possible that during checking we try to access beyond the lexicon's table boundaries, so that needs to be prevented. 0865615 CentOS 6 fixes: Move AC_INIT up to prevent _m4_divert_diversion error; Use older ax_check_compile_flag.m4 833e6f6 autoconf-archive doesn't exist on RPM platforms, so bundle ax_check_compile_flag.m4 e59a06e Determine and use highest supported C++ standard f8d3893 Reinstate legacy function Transducer::lookup() due to Voikko's use of it b645b28 Unknown and identity arcs can now appear both in the lexicon and error model This is a rather large overhaul of everything having to do with symbols, and causes an approximately 8% slowdown at this time. Some more efforts will be made to bring that down.. 28cd37b Make weight checking methods const d76526a Correct nbest handling for cached results too + remove extranous comparison class a10455e Special-case limit checking for nbest 568d606 Correct limit setting 22bd0b5 Remove debugging print ed57354 Use a list to emulate a sorted double-ended weight queue effe39d Removed keyword 'typename', it caused GCC with C++03 to bark. 57891b8 Drop down to boring ol' C++03 524f259 Fix declaration order and unicode issues 1feb23e Add --default-weight, --outputfile, pair specification syntax for regex output d40e733 Add a --regex mode to produce a large regular expression with weighted replace rules 612aa05 Add a first-symbol cache and do some reorg 6a94a9b Fix -Wall -Wextra issues 2fcd06e New protocol: Let client choose max suggestions, input must now match "^(\d+) (.+)$" dcb9474 Search up the normalization list until something gives suggestions ad14157 Disable log2 max-dist and instead set hard limit due to FST weights e7cfbc6 Tokenization is done by the host, so don't second guess e484ef1 Add hfst-ospell-office helper so the algorithm is in one place, instead of in every frontend 14c72ca Arguably clearer scheme for combining limiting options 0c5eeec Reinstate legacy spelling, slightly adding to but not altering interface 4d1fa2b Add option --beam=W for restricting the search to a margin above the optimum. Any combination of n-best, weight limit and beam is possible, all restrict each other. 29180f3 Use vector instead of deque now that breadth search isn't used for anything ea15127 Be a bit more systematic about search-limiting behaviour, also fail earlier, this gives a small speed improvement a9d9e55 Make argument parsing further agree with documentation, reinstating -X e2b9b7e nbest wasn't producing any runtime benefit due to incorrect comparison 94064b5 Reinstate the : lost from the -n arg in getopt_long() b164f2b Change max_weight to max-weight in the help string and argument name 48d6d15 Check that maxweight is positively set before using it, default is negative a782763 Version 0.3.1 - this introduces API/linking changes Support limiting the correction search to a maximum weight. The hfst-ospell command supports this via --max_weight / -w. 76d9052 Make -n (max n corrections) work again 4d6f7dd Have separate switches for each XML mangling library d39d419 Updated hfst-ospell man page. 1f38ba3 Just error out when missing XML d7058e8 Allow selecting real-word errors 8866483 Missing stdarg reqd in >gcc 4.8 or somesuch 561760e Added configure and Makefile files for Windows compilation. f19070d A tiny performance improvement (about 5%, this is not what was discussed on irc, but every little helps) a674152 Fix missing newline in verbose message ba22ebb This is how I got the analysing speller to work f884489 Simple example of analysis spelling logic 4019232 can analyse if speller or sugger f621a37 Fix bug #239 When correcting, keep track of mutator's output / lexicon's input; when analysing, keep track of lexicon's output bfd40d9 Fix bug #238 Now that alphabets are translated even if the error source contains symbols not found in the lexicon, the lexicon needs to be prepared to be asked whether there are transitions with NO_SYMBOL f7b6dcc documentation front page 0a17be0 Dist changes, 0.3.0 API stable though pending other fixes for .1 03d40cc Some beginnings for a documented stable API. a8dc221 This has been missing all along? 7f70384 Allow setting max suggestions from command line tool edc36e7 Remove legacy, allow setting suggestion limits a61dcd2 Tentative support for analyse-as-you-error-correct 8847e56 Errmodel with out of lexicon alphabet fails f702e28 Better test case for analyser c96a2e8 Add tests for analysing speller and missing errmodels 104a24d Analysis interface and implementation for two-tape language models. The suggestion engine still erroneously gives the analysis side of suggestion LM though. 75fc585 Now printing and reading to/from console works in hfst-ospell is it is compiled with -DWINDOWS. 94031ca Added a rule for man page creation in Makefile.am. e293f96 Added a test for an empty zhfst file. It should fail, but presently hfst-ospell behaves as if all is ok, just that it isn't able to correct anything. add8ddd Corrected return value to what it should be for a SKIP result, which I assume was intended. 09ea12d Added files to the distro. de621bd Added test to check whether error models with extra chars are correctly handled. The test passes with revision 3793 of ospell.cc, and fails with revision 3729. 4860369 Silently ignore non-lexicon symbols in the error source, translating them to NO_SYMBOL on the output side. 9e8ae05 revert 8e74395 What happens if I do this? 41bb8cf Moving implementations? 9629091 Install men 5a0fbdb A man page for debian's lint f4850b0 Probably libarchive deprecation will go away with different function name here too f76b1fa Move some of the code away from headers db0c56d some old changes on work desktop? 16511d0 Ensure other potentially empty tags for libxmlpp 832f9d0 Bumped version for tinyxml2 check to get lengthful Parse by Børre Gaup 201acd8 unarchive to mem and tinyxml2 xml length fixes from Harri Pitkänen 220e6cb Reverted the patch by Børre to clean the xml off of random byte noise after the end of the xml document. The issue was caused by zip compression incompatibilities between different plattforms, and the solution (for now) is to just skip compression altogether, using 'zip -Z store' instead when building the zhfst files. This creates a zip file readable everywhere (but without compression). c57f0bd Test for trailing spaces in xml structure ad60602 Patch by Børre Gaup to clean corrupted xml data returned from Libarchive when reading xml data from a zhfst file stored in RAM: b0ba291 Whitespace changes to make the file easier to my eyes, comments. Added tests for empty locale and empty title elements. 42da63d Data and shell scripts to test empty titles and empty locale. 6b5145f Check for empty locale node. 0838caa Clean and ignore index.xml. 26e6b99 One more test for empty data. 2b1089f Added newline before error cause. a440da7 Empty descriptions will throw (there might be others left) 58047b3 index.xml with empty descriptions 2f93bcc Test.strings shall not be deleted when cleaning c4adeb0 Added check for the availability of pkg-config - without it configuration will fail in subtle but bad ways. cd24885 Document for next release candidate 3365813 Missing model attribute parsing for errm 5ca88f6 Use Elements instead of Nodes and other such fixes bd8eada Use automake conditionals to avoid pkg-config linking to libraries that are not in configure's code paths 0322db3 update configuration for that 2153668 Add tinyxml2 versions of XML parsing 658ff68 conference demos as configure option b2ff3fb stats -> status e20d008 Allow to end correct correction c16d604 Merge lookup() from devbranch 0761df3 This should be useful fb1c6c1 Fix the output tsv a bit more 299b94d Slight udpate for new measure ments 44f6276 Few starts f25c609 Allow selection of xml backend 4be4b96 Switch <3 into >3 because it looks nicer. 5d80f3f strndup() fixed by Tommi. e11a913 Pending proper understanding of why strndup is getting defined several times over, let's always use hfst_strndup() instead. 248d05d Add include guard to custom strndup so it doesn't get compiled more than once af41cbf Move our custom strndup to ospell.h so it will be seen by all compilation units that need it bd813b0 Renamed the hfstospell package and variables to what they used to be, in case that can help solve a build issue with libvoikko+hfst. 83b1a09 Wrap and load f03ef00 versioninfo 3:1:1 d70a9e6 Preparing for 0.2.3 release. c8ba084 Changed version number to 0.2.3, and renamed the package and tool name to 'hfst-ospell', since that is what the command-line tool is actually called, and it is consistent with the rest of the hfst tool names. 3d40399 Should fix #176 Flag state from the stack back was getting clobbered in some situations. Now we restore flag state after modifying it for new nodes. ac10ddd Fix dist stuff, wrap. 3fa2b5d Use pkg-config instead of autoconf to check libarchive 7e422b7 Prepare files for 0.2.2 89a4a99 Fix few leaks 44b313c Fixes to [#165] by Harri Pitkänen 3ad822b Made some changes to depth-first searching to prevent just-added nodes getting removed. This appears fo fix a very long-standing and serious bug dating from r2763. 36fa55c When data is in memory, allocate & copy it in ospell rather than expect caller to keep track of it (which it usually doesn't) 8196d8a Add access to metadata 1da34c4 When initialised from memory, don't assume responsibility for freeing it 01a43fd Somewhat experimental (no checking) way of saving memory by ignoring alignment issues (4-5 x memory savings) db95a32 Remove leftover agenda variable 2ee0efc Add stuff for survey article a439eaf Revert to breadth-first searching pending bugfix 8166a16 Search depth-first by preference for hope 876ebbf Forgot some important checks that we actually want to limit the results b1b6b24 Enforce an n-best limit for continuing the search, just for breadth-first for now 78f2345 Don’t just break on empty lines ab3f591 Switch order of errmodels and acceptors in legacy read 5baaaed set spelling caps on legacy read 7c60d3e Test fallback a1b1e35 The VPATH approach to the test shell scripts was wrong. Now everything is working as it should. fcef585 Enabling VPATH building for 'make check'. f8d7a54 Whitespace change only in configure.ac. More generated files to ignore. b6f6482 Added some whitespace to ease readability. Replaced pattern rules with suffix rules to make automake happy. 3027315 Fix compilation without xml, as suggested by Harri Pitkänen on libvoikko-devel 9e2503d do not close unopened files, handle legacy fallback reading errors 9e3739d print details of xml parsing errors 5791425 Add the very basic test suite e39e40b Throw exception if reading a zip without existing zip file. 3ebb0aa Optionalise the libraries 4ab1a3a Update for automake-1.12 and AM_PROG_AR 99f80d4 kill everyone if windows linebreaks 9459c17 Fix documentation and set for 0.2.1 release ab90ec9 free more things and stuff bca66ba Move Xml metadata parsing and storing into own file and class 686d305 Increment *raw when reading bools 5e91e84 Added utility function for iterating c strings in raw memory, use it in every branch of symbol reading 246c2b2 Fixed problems having to do with reading strings from a transducer in raw memory. 15b3184 fix the tmpdir'd version again bc2d9af Version that extracts zhfst to memory iff it fits on one throw 69ff631 Added raw memory constructors and table readers for slurping in transducers from char * 4772d28 Missing files lol 77c23cf ...and in that case the final identity states also are greater by one. 58e9512 When avoiding initial edits, there needs to be an extra inital state before we do any edits. So add the value of options.no_initial to the state range loop. 433e917 Commented out hfst-ospell-cicling from Makefile. 73f8871 Comment out missing cicling target b6a3317 "else if" instead of incorrect "if" in option handling 620392f --no-string-initial should have transitions to state 1, not 0 9f8cf3c Don't exit on empty lines 2ab0594 Don't print unhelpful warnings 4fb22d1 Silly default for minimum edit 2e58cd8 fallback_spell() ca9a3b0 --no-string-initial-correction 76c87fe Corrected eliminations (hopefully), added --minimum-edit option 713aff2 Profiled version for fsmnlp-2012 measurements§ 999ca00 Don't skip state numbers when certain options are turned off (and don't print debugging lines) e864859 Corrections to redundancy elimination 49b4394 redundancy elimination was being performed one state too late (too little) bb93e12 Add option to disable redundancy elimination 731db86 Initial support for redundancy elimination (refuse to do insertion after deletion or deletion after insertion) 83f85ce Bugfix: identity transitions were being forgotten in the last edit state 9ca92d4 Enforce @_UNKNOWN_SYMBOL_@ instead of @?@, which users didn't know about 2b001e6 Fixed one remaining UTF-8 stderr printing bug. aee7a2a Fixed bug with the newline character not being stripped from excluded symbols 92ab53a Lines that start with ## are comments fd5f560 Updated help message & added exclusion of symbols by prepending a ~ 45442ab Wrap stderr with a utf-8 codec so we can print non-ascii symbols when verbose 5c27eca Write alphabet with weights when verbose f9b426f Order of preference of alphabet definition is now configfile - commandline - transducer. If configfile gives a weight after a tab for symbols, they are used additively for all edits involving those symbols. 6629f88 Some clarifying comments 33e3c95 Rescue identities from being considered substitutions 4ce7bb0 utf-8 -decode the user-specified transitions from a conf file (so easy to forget one of these...) 42ec180 use same includedir in pc as makefile 40afbca don't declare strndup in public headers b5a6c7a make ol exceptions in hfst_ol namespace, provide stdexception style what() 3a58fbb Add verbose, quiet c821e19 * parse all of the metadata if possible * use c++ ``struct''s for metadata 3f5bb8b * Use temporary filenames by tmpsnam * Do not delete Transducers in data structures since it will segfault all enchant-based applications in dtor 5449a28 libhfst-style exception macros and some more informative messages e9c551d Document dependencies 105b6e2 mac os x fixes: * strndup * libarchive installed without headers a2b4ea5 Preliminary zhfst support d7bf3e8 Example from: http://norvig.com/spell-correct.html and similar tests a2b9d15 The test for final weighted transitions involves target_index == 1 instead of weight == INFINITE_WEIGHT 3cb5d8a Fixed bug involving bytewise casting of longs to floats (I misunderstood what static_cast really does I guess). 7ff6a13 fread returns element count, not byte count 9c01f47 duplicate definitions a433823 fix msvc problems 4f2c186 check fread return value as advised by gcc b324b05 Removed unnecessary test 9f801d0 Understand hfst3 headers, don't demand weightedness at header-reading stage 693eaea use hfstospell library name for compatibility or whatever c0a2985 MSVC fixes: * include <string> when using strinG * use boolean operators instead of aliases? 4f16033 add pkgconfig stuff 5ebbfec autoconfiscate :-) 37ad031 Add licences everywhere for release 4e7c098 make directories that do not exist 5c450f7 Install to destdir a93f59d Ignore compiled libraries. 7e0a5b4 fix missing dash in mac dylib magic 3bd799d Silently ignore if empty labels are missing b6bc407 Make dynamic or share libraries 3767ad9 Added some fancy rule autodetection 5b87174 Fixes to input format handling 9a7418c New input file syntax fd3eb19 More speed improvements a78427f Various optimizations dff1eab Critical bugfix, output now believed to be correct c227d79 Diagnostics and info about expected transducer model in test script 196ff02 More helpful help message for test script 1d1acfd Should be executable. c50bb9b Support for OTHER symbol in test script 44e6e9c Added profiler flag to debug compilation target and made demo utility exit on empty lines 2588216 Trivial cosmetic changes a059914 More header cleanup 9c551c6 Renamed variable 7035738 Misc. code cleanup and memory savings 7339e8c Free memory holding transducer data after parsing 915ef1d Some more const sprinkling 59040cc Misc. nonfunctional cleanup 5852517 Hack to make the test script handle unicode 915a6e2 Added character swaps to edit distance script. You have to enable them with the -s flag - they generate A(A-1)*D new states and twice that many transitions, where A is the size of the alphabet and D is the edit distance. Pretty expensive. Is there a better way? f77baa8 Improvements to editdist.py - see test/editdist.py --help fa61c7d Put 1.0 weights on the test generator script 2bea19e Minor enhancement to test script fb6b65f Added helpful runtime error for alphabet translation problems, updated demo utility to make use of it 31a3561 Better checking of read operations, added relevant exceptions 397a4b5 Made example formatting in README more consistent - I may have broken Tommi's commit a bit, but I think it's ok now... 2e933a8 Minor improvement to demo a7cf1f3 Static lib and fixes to xamples in readme 161d40d Added comment b383a42 Reversed previous commit, which did the opposite of what the commit message said it would. Committer will go to bed now... a4877dc Return results in reverse order by weight, ie. in order of quality (instead of the opposite) 5827928 Removed obsolete dependency on cassert d8f388e Fixed some comments 5d1d58a One more formatting fix 403d7ed Fixed typo in comment d49e37d Moved getopt dependency from ospell.h to the demo utility proper 5e1b2f1 Formatting improvements f11ed65 Introduced an exception for handling alphabet translation failure, fixed typo in help string, updated README 73647b3 Made some changes to correction-storing data structures to make sure each correction string only appears once 25207f7 Fatal bug(s) fixed, (more) correct flag diacritic functionality 57441ea New test script 82620ee Fixed typo 778d477 Fixed a braindead bug that subtly broke everything, this should make some code redundant 2a3ffcf A way to handle flag diacritics 47d35eb Trivial Makefile fix for commandline tester 564e7f2 Replaced some ungracious exits with exceptions and made small change to README d92e82c Added README and some fixes bd2c6c3 Implemented spellchecking and correction library functions; documentation, proper packaging and esp. functioning flag diacritics still to be done. d3ed0b7 Temporarily de-autotooled ospell 7344a8e Incorporated queue in speller proper dbc3bae Fixed behaviour, added weightedness scaffolding e4851c1 Corrected typo. 9df6f72 Initial commit of hfst-ospell. Basic functionality including OTHER symbol (@?@) and runtime alphabet translation is present; weighted transducers (probably to be the only option) and flag diacritic states for the mutator and lexicon forthcoming.
Generated by dwww version 1.15 on Sat Jun 15 19:13:58 CEST 2024.