A SYMMETRIC VERSION SPACE ALGORITHM FOR LEARNING DISJUNCTIVE STRING CONCEPTS

dc.contributor.authorBaltes, Jackyeng
dc.date.accessioned2008-05-20T23:29:57Z
dc.date.available2008-05-20T23:29:57Z
dc.date.computerscience1999-05-27eng
dc.date.issued1992-03-01eng
dc.description.abstractThe paper describes an algorithm to learn disjunctive string patterns from examples. An implementation of the algorithm learns concepts such as all C source files (*.c or *.h) in the UNIX domain. The algorithm is designed for an interactive learning model in which the number of questions to the user are minimized. Furthermore, only simple questions to the user are admissible. The representation language is restricted to a subset of regular expressions in order to reduce the complexity. The patterns in the concept grammar are extensions of Nix's gap patterns and Mo's annotated gap patterns. To learn sequences of patterns, the strings are broken up into units. The algorithm assumes that the first example is a good representative of the concept, that is, it contains all optional units. Even for limited disjunctions, the G-set of Mitchell's Candidate Elimination algorithm is infinite for domains that contain an infinite number of elements. A symmetric version space (SVS) algorithm using cover sets of the positive and negative examples computes only the most specific description that matches a set of examples and can therefore be used to learn in version spaces that make it impossible or infeasible to compute the most general descriptions. Two extra cover sets are maintained to avoid asking the user about all possible examples, a problem that arises when the number of terms in the concept is less than the maximum number of terms in a disjunction. The correctness of the SVS algorithm in version spaces with minimal generalization hierarchies, so called k-disjunctive version spaces, is shown. Furthermore, it is argued that an extended version of the SVS algorithm solves the swapping and splitting problem. The last result of the paper establishes that, though the size of the version space grows exponentially, the sample size complexity increases only linearly with the maximum length of units.eng
dc.description.notesWe are currently acquiring citations for the work deposited into this collection. We recognize the distribution rights of this item may have been assigned to another entity, other than the author(s) of the work.If you can provide the citation for this work or you think you own the distribution rights to this work please contact the Institutional Repository Administrator at digitize@ucalgary.caeng
dc.identifier.department1992-468-6eng
dc.identifier.doihttp://dx.doi.org/10.11575/PRISM/31285
dc.identifier.urihttp://hdl.handle.net/1880/46538
dc.language.isoEngeng
dc.publisher.corporateUniversity of Calgaryeng
dc.publisher.facultyScienceeng
dc.subjectComputer Scienceeng
dc.titleA SYMMETRIC VERSION SPACE ALGORITHM FOR LEARNING DISJUNCTIVE STRING CONCEPTSeng
dc.typeunknown
thesis.degree.disciplineComputer Scienceeng
Files
Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
1992-468-6.pdf.gz
Size:
194.2 KB
Format:
Unknown data format
Description:
License bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
1.86 KB
Format:
Plain Text
Description: