Brainstorming Requirements, Features, etc.
From MWCSWiki
I was doing some brainstorming (mostly trying to commit to writing all of the things that we have discussed so far) and I came up with this tentative and incomplete list of features, requirements, and functionality.
Please fill in the gaps, add more sections, and comment on stuff I put down. Then we can use this list later in meetings and make sure we all agree on what we need to do.
I'm hoping we can collaborate on this list and eventually it will evolve into a definitive requirements and design document. :)
Great work, Rebecca! (Stephen's comments in blue)
What are our goals?
Goals:
- provide user-friendly access to both low and high skill-level users
- provide an intuitive interface for the insertion of new data (allow bulk import capability as well) [Stephen: Bulk import capability sounds awesome, although my experience says that this typically has to be customized for each data source (unless there is some standard format for representing linguistic data?)]
- provide the ability to search, analyze, and view data in ways that linguists will find most helpful [Stephen: I agree completely, and add this caveat: some of these "ways" will be things that linguists cannot currently envision, since no tool yet allows them to do it. Often experimental technology opens up a new area of research that those who will ultimately benefit didn't anticipate. When Edison invented the light bulb, he did it in response to people who simply wanted a better gas lamp: they didn't even have a "light bulb" on their radar screen. Hence the new technology initiated a paradigm shift.]
- export data in a usable and printable format (export as Word, PDF, Excel)
- allow users to store and listen to sound files
- allow users to store image files (of written text) [Stephen: I hadn't thought of this.]
- allow users some way to type in their script (support through Unicode)
- provide correct modeling of the language (relationships between words, phrases, and sample sentences) in the database
- provide a sense of community, a networking device between native speakers and linguists who are interested in/speak a language
Who are our users and what will they be doing?
Native Speakers
will mostly be inserting new data
Linguists
will be inserting new data, most likely in a bulk format (not sure if this is typically bulk or not? What about manually entering notes from field work?), and will be using it as a reference and analytical tool.
System Administrators
manage user accounts, monitor data, delete clearly invalid data, prevent inappropriate use of the site, monitor and respond to errors and user requests for information or support. [Stephen: I hadn't thought of this. I'm assuming this will mostly be in the future.]
What are our Sub-Systems? (or ways to categorize different parts of our system)
Interface
?
Storage
?
Error-handler
we need to:
- handle errors gracefully
- give ourselves meaningful output for debugging
- set up email notification of errors (important when the app is online and in use)
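A rough sketch of how these three error-handling needs might fit together, using Python's standard logging module. The logger name, the sender address, and the helper names build_logger and safe_call are all made up for illustration; nothing here is a decision about the real app.

```python
import logging
import logging.handlers


def build_logger(mail_host=None, admin_email=None):
    """Return a logger that prints debug output and, optionally, emails errors."""
    logger = logging.getLogger("lexicon_app")  # hypothetical application name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers if called twice

    # Meaningful output for debugging: timestamp, level, and message.
    console = logging.StreamHandler()
    console.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(console)

    # Email notification, only once a mail server is configured (assumed settings).
    if mail_host and admin_email:
        mail = logging.handlers.SMTPHandler(
            mailhost=mail_host,
            fromaddr="errors@example.org",  # placeholder sender address
            toaddrs=[admin_email],
            subject="Lexicon app error",
        )
        mail.setLevel(logging.ERROR)  # only errors are emailed, not debug chatter
        logger.addHandler(mail)
    return logger


def safe_call(logger, action, *args, fallback=None):
    """Run action; on failure, log the traceback and return fallback instead of crashing."""
    try:
        return action(*args)
    except Exception:
        name = getattr(action, "__name__", "action")
        logger.exception("Unhandled error in %s", name)
        return fallback
```

For example, safe_call(logger, int, "not a number", fallback=-1) logs the full traceback and hands back -1, so the page can show a friendly message instead of failing.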
Account Management and Activity Recorder:
we need to:
- store, create, edit, and delete user accounts (store info about the user)
- monitor user account activity (which accounts are inactive, etc)
- know which user account made which entries
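One possible shape for this sub-system, kept in memory so it is easy to read. The role names, the 90-day inactivity cutoff, and the class and method names are assumptions for discussion, not agreed requirements; a real version would sit on top of whatever database we choose.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Account:
    username: str
    role: str          # assumed roles: "native_speaker", "linguist", "admin"
    region: str = ""   # info about the user that may matter to a linguist
    last_active: datetime = field(default_factory=datetime.now)


class ActivityRecorder:
    """Tracks accounts, their last activity, and who made which entry."""

    def __init__(self):
        self.accounts = {}       # username -> Account
        self.entry_authors = {}  # entry id -> username

    def add_account(self, account):
        self.accounts[account.username] = account

    def record_entry(self, username, entry_id, when=None):
        """Note that username created entry_id and refresh their activity time."""
        account = self.accounts[username]
        account.last_active = when or datetime.now()
        self.entry_authors[entry_id] = username

    def author_of(self, entry_id):
        return self.entry_authors.get(entry_id)

    def inactive_accounts(self, now, max_idle=timedelta(days=90)):
        """Usernames with no recorded activity for longer than max_idle."""
        return sorted(
            name
            for name, account in self.accounts.items()
            if now - account.last_active > max_idle
        )
```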
Backup
we need to:
- back up the system periodically and store backups
- be able to restore the system from a stored backup
What are our Features?
Basic Features (definitely in the prototype):
- create and manage user accounts
- make new entries
- search
- search based on lexeme
- search based on gloss / English equivalent
- search based on pronunciation
- search based on "pronunciation component within a lexeme" (or whatever Fallon was telling Will about...)
- store sample sentences, link each lexeme in the sample sentence to an entry for that lexeme
- navigate between sample sentences, individual lexemes, and related lexemes
- allow a method for input, storage and display of non-Roman characters
- store, retrieve, and display disparate data in a coherent way
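To make the search items above concrete, here is a minimal in-memory sketch. It assumes each entry has just three text fields (lexeme, gloss, pronunciation), and it guesses that "pronunciation component within a lexeme" means a substring match on the transcription; that guess needs checking with Fallon. It also normalizes Unicode before comparing, since the same non-Roman or accented character can be typed in more than one way. The Entry class and function names are hypothetical.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    lexeme: str
    gloss: str          # English equivalent
    pronunciation: str  # assumed to be stored as a transcription string


def normalize(text):
    """Normalize Unicode (NFC) and case so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", text).casefold()


def search(entries, field_name, query, partial=False):
    """Return entries whose field matches query exactly, or contains it if partial."""
    if field_name not in ("lexeme", "gloss", "pronunciation"):
        raise ValueError(f"cannot search on {field_name!r}")
    wanted = normalize(query)
    matches = []
    for entry in entries:
        value = normalize(getattr(entry, field_name))
        if value == wanted or (partial and wanted in value):
            matches.append(entry)
    return matches
```

With partial=True on the pronunciation field, a query like "ato" would return every entry whose transcription contains that sequence, which is one reading of the "component" search.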
Non-Basic Features (not strictly necessary for the prototype, could wait until next semester):
- store sound files, allow user to listen to sound files
- provide linguists a tool for bulk entry of research notes
- provide a tool for exporting data in an easily usable format (maybe Excel?) [Stephen: XML would be awesome, though I don't know if Excel can import that (yet). If not, I think just comma-separated would be easy and Excel-compatible.]
- use WordNet to prompt user to establish synonymous relationships between words (part of me feels like this IS essential in the prototype. What do you guys think?) [Stephen: Something tells me "yes," although if something has to go, this is probably it, at least for the fall.]
- forum, for discussing and debating linguistic topics, for discussing individual entries, and for general chat and networking (if we find an interest for that) [Stephen: I think some of this is beyond the scope of what we want to demonstrate for this topic. "Discussing individual entries," however, is worth thinking about. That may end up being a vital part of what "storing/retrieving/displaying disparate data in a coherent way" entails.]
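For the comma-separated export idea, a small sketch using Python's standard csv module. The column names and the function name are assumptions; entries are taken to be plain dictionaries here only to keep the example short.

```python
import csv
import io


def export_entries_csv(entries, columns=("lexeme", "gloss", "pronunciation")):
    """Return the given entries (dicts) as CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any fields not listed in columns;
    # missing fields are written as empty cells.
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for entry in entries:
        writer.writerow(entry)
    return buffer.getvalue()
```

The csv module quotes values that contain commas, so a gloss such as "house, home" stays in one cell. When saving to a file for Excel, writing with the "utf-8-sig" encoding is worth trying so that non-Roman characters display correctly.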
What kind of data should we store about each "entry"?
we should remember and display some info about who made the entry (for example the region where the native speaker grew up could be meaningful to a linguist viewing the entry). We might want to support a "contact this person" feature, by which linguists could contact other linguists and native speakers.
we should allow users to make the following relationships:
- synonyms (Does this mean "pretty much exact synonyms?" Or "related words?" I think linguists (and people in general) often think in terms of the latter, and it's this phenomenon that I'm mulling over.)
- meronyms
- opposites
- "is derived from"
- "is a root of"
- "is a form of"
we should allow users to specify the following characteristics:
- slang
- informal
- formal
- acronym
- phrase
- idiom (This brings to mind the distinction between "lexeme" and "lexical unit" on the linguistics information page you posted. Are we storing just lexemes, or also multi-word lexical units? The latter is probably ideal, in which case I guess we're building a "collaborative lexis" instead of a "collaborative lexicon.")
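A first pass at how the relationships and characteristics listed above could be modeled. The class names are hypothetical, and two choices are assumptions to debate: that synonym and opposite links are stored in both directions, and that "is derived from" and "is a root of" are treated as inverses of each other.

```python
from collections import defaultdict
from enum import Enum


class Relation(Enum):
    SYNONYM = "synonym"
    MERONYM = "meronym"
    OPPOSITE = "opposite"
    DERIVED_FROM = "is derived from"
    ROOT_OF = "is a root of"
    FORM_OF = "is a form of"


# Assumption: which link to store automatically in the reverse direction.
INVERSE = {
    Relation.SYNONYM: Relation.SYNONYM,
    Relation.OPPOSITE: Relation.OPPOSITE,
    Relation.DERIVED_FROM: Relation.ROOT_OF,
    Relation.ROOT_OF: Relation.DERIVED_FROM,
}

# The characteristics listed so far; the list still needs to be verified.
CHARACTERISTICS = {"slang", "informal", "formal", "acronym", "phrase", "idiom"}


class Lexicon:
    def __init__(self):
        self.relations = defaultdict(set)  # lexeme -> {(Relation, other lexeme)}
        self.tags = defaultdict(set)       # lexeme -> {characteristic}

    def relate(self, source, relation, target):
        """Link source to target, adding the reverse link when one is defined."""
        self.relations[source].add((relation, target))
        inverse = INVERSE.get(relation)
        if inverse is not None:
            self.relations[target].add((inverse, source))

    def tag(self, lexeme, characteristic):
        if characteristic not in CHARACTERISTICS:
            raise ValueError(f"unknown characteristic: {characteristic!r}")
        self.tags[lexeme].add(characteristic)

    def related(self, lexeme, relation):
        """Sorted list of lexemes linked to lexeme by relation."""
        links = self.relations.get(lexeme, ())
        return sorted(other for rel, other in links if rel is relation)
```

Storing entries as arbitrary strings means a multi-word unit such as an idiom can be related and tagged the same way as a single lexeme.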
Notes from 9/16 meeting:
- Need to verify with Fallon (or whoever) that the categories we have identified ("slang," "informal," etc.) are the right ones. "Technical?" "Offensive?" "Commonly used?" "Archaic?"
- Goal for next Thursday: we meet, and at the end of that meeting, we have agreed on a set of use cases.

