Notes

(See VersionOne.)

Introduction

As of 2007-01-16, I'm thinking about writing a note application, but I've swung back to wanting to store notes as text, so they can be edited in a regular text editor.

Considerations

  • Program should be free so that I can install it in multiple places (home computer, slate, thumb drive) without the need to buy multiple licenses. (NdxCards costs $49.)
  • Data should be text so that it can be viewed and edited in a text editor. (NdxCards stores data in a proprietary database format.)
    • The Euphoria Database System (EDS) is tempting, and I may use it yet, but no file format in the world is better supported than plain text.
  • I need to write the program. No program I've found has all the attributes I want.
  • Code should be Euphoria. Euphoria (docs) has simple English-like syntax, is easy to debug, and can produce standalone executable files.
    • PHP requires a website or a setup like WAMP, and it cannot produce executable files (as far as I know).
    • C and C++ require meticulous tracking of memory allocation and are much harder to debug than Euphoria.
  • File should contain many notes because keeping each note in its own file wastes disk space.
  • Notes should be separated by a divider line so that the notes themselves can be formatted with newlines, so that they are readable in a text editor.
    • An alternative I rejected was to store each note as a single line, with newlines encoded as "\n". Not only would notes be harder to read in a text editor, but this would require my program to convert newline characters whenever the file was loaded or saved.
  • Divider line needs to be recognizable in a text editor, and not confusable with something else. If the first twelve characters of a line are #<{[NOTE]}>#, then the line is a divider line (and the rest of the line is ignored).
    • I was going to allow the divider string to appear up to six characters away from the start of the line, so that a source-code file could contain (commented out) divider lines and thus be used directly as a note file. But how does the application know which comment characters to put in front of a divider line when the user creates a new note? It would be better to have the app export a note file to a text file, i.e., everything is written out except the divider lines.
  • Plain-text fields should be mixable with Wiki text. I'm using ((K:a keyword)) to add "a keyword" as a keyword to a note, and ((name:value)) to add to the note a field named "name", set to "value". Neither PmWiki nor Schtuff takes this double-parenthesis format as Wiki formatting, so I expect this will be OK for other Wiki markups. In addition, Euphoria's wildcard_match() function can find a field using the pattern "*((*:*))*" (the pattern has to match the whole note, hence the asterisks at each end).
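
A minimal sketch of how the ((name:value)) fields could be pulled out of a note with plain match() and find(). The function name getFields() and its return layout (a list of {name, value} pairs) are my own assumptions, not part of the design above.

```euphoria
-- Sketch: collect every ((name:value)) field in a note.
-- getFields() and its return layout are hypothetical.

function getFields(sequence note)
-- returns a sequence of {name, value} pairs, in order of appearance
	sequence fields, body
	integer startPos, endPos, colonPos

	fields = {}
	while 1 do
		startPos = match("((", note)
		if startPos = 0 then exit end if
		-- drop everything up to and including the "(("
		note = note[startPos+2..length(note)]
		endPos = match("))", note)
		if endPos = 0 then exit end if   -- unterminated field: stop
		body = note[1..endPos-1]
		colonPos = find(':', body)
		if colonPos > 0 then
			fields = append(fields,
				{body[1..colonPos-1], body[colonPos+1..length(body)]})
		end if
		-- continue scanning after the "))"
		note = note[endPos+2..length(note)]
	end while
	return fields
end function
```

A keyword is then just a field whose name is "K", so the same routine would serve for both keywords and named fields.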

Macros and intermediate language (IL)

IMPORTANT UPDATE!!! I am reconsidering stacks. Maybe each routine needs its own sequence instead. The first item is TRUE if the sequence should be returned to the caller and FALSE if nothing should be returned. The second item can be anything -- an atom, a sequence, whatever should be appended to the caller's own sequence when the routine returns (if the first item is TRUE).

So when you specify a literal string, it goes into a "stack" reserved for the next word you call. (This next word could be something like "stash" to put the contents of this "stack" into the "stack" of the current routine instead, or "dup" to copy the item without removing it from the stack intended for the word to be called.)

  • Program should be programmable. It should be possible to record one's actions and replay them later. So the program will need a macro recorder and a macro interpreter -- and thus a macro language.
  • Macro language should be Forth-like. This yields the simplest syntax to interpret, as each individual word does something, so that words do not need to be combined into phrases before they make sense, and there is less need for syntax checking. This implies a stack, and basic commands to manipulate the stack (DUP, DROP, SWAP).
  • IL is a sequence of word IDs and literal strings. A word ID is an atom -- an index into a Forth-style "dictionary" of word definitions; the interpreter interprets a word ID as a call to the IL code stored in the dictionary for that word. A literal string is a sequence -- a string to be pushed onto a string stack, just like literal numbers in a Forth program.
  • IL should not use routine_ids. An atom in a sequence of IL is always an index into the word dictionary. Only built-in words -- words whose index falls below that of the first user-defined index -- should be defined with a routine_id. (Thus the interpreter routine that "executes" an IL atom does this: if word_ID < first_user_ID then call_proc (dictionary [DEFINITIONS] [word_ID], {}) return end if.)
  • Dialog boxes should return IL. A dialog box should spit out a sequence of IL, which can then be attached to a macro or interpreted immediately.
  • Interpreter should be simple. It does only the following:
    • Send a routine_id to call_proc() (if the word ID is in the range of built-in words).
    • Push the current IL sequence and position onto a call stack, and start executing another IL sequence.
    • Pop an IL sequence and position from the call stack, and start executing from that position within the sequence.
    • Push a string onto the string stack.
  • Program should be written around the IL interpreter.
  • Program needs a compiler. The compiler would convert text words into a sequence of IL (and text strings in doublequotes, with embedded double doublequotes, into literal strings). I'll need this to run macros stored in a file as text. (This should be relatively simple if each word ID has one unique text equivalent and each text word has one unique word ID.)
  • Program needs a decompiler. The decompiler would convert IL into text (and literal strings into text strings, enclosed in doublequotes and with embedded doublequotes doubled). Since a macro would be recorded as IL, I'd need this to save the macro as text or to present the macro for editing.
  • Program needs a debugger. A programming environment this different from the conventional C- or Basic-style language cries out for a debugger. Users will want to execute macros a word at a time and see the results on the stack, on the dictionary, and on the state of the application. They'll want to set breakpoints, to run code until a given condition is true, skip over words, rewind code (and roll back changes made), and probably a few other things.
  • QUESTION: Should users be allowed to pop things from the stack directly? If an IL word leaves the stack at a certain depth, it should do so no matter what; otherwise you might have an item on the stack when things went OK but not when an error occurred -- leaving the caller to try to figure out what went on and to check the stack whenever the word is called. It might be possible to keep words like DUP, DROP, and SWAP hidden and used only by built-in routines. Maybe it's time to explore whether we can get by with a one-item stack -- one you fill with your own string, or one that gets filled with a default value. Instead of a data stack per se, an elaboration of the call stack: Each routine gets ONE data level, but this can be a sequence of sequences. When the routine returns, this sequence is appended to the data level of the caller.
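
The interpreter described in the list above can be sketched roughly as follows. This is only an illustration under my own assumptions: the names stack, defs, firstUserID, run(), wordDup(), and wordJoin() are hypothetical, and it uses the conventional string stack rather than the per-routine sequence I am still considering.

```euphoria
-- Sketch of the IL interpreter loop. All names here are hypothetical.

sequence stack      -- string stack
sequence defs       -- defs[id]: routine_id (built-in) or IL sequence (user word)
integer firstUserID
stack = {}

procedure push(sequence s)
	stack = append(stack, s)
end procedure

function pop()
	sequence s
	s = stack[length(stack)]
	stack = stack[1..length(stack)-1]
	return s
end function

procedure wordDup()
-- built-in word: copy the top string
	push(stack[length(stack)])
end procedure

procedure wordJoin()
-- built-in word: pop two strings, push their concatenation
	sequence b
	b = pop()
	push(pop() & b)
end procedure

defs = {routine_id("wordDup"), routine_id("wordJoin")}
firstUserID = length(defs) + 1

procedure run(sequence il)
	sequence calls   -- call stack of {il, position} pairs
	integer pos
	object item

	calls = {}
	pos = 1
	while 1 do
		if pos > length(il) then
			-- end of this IL sequence: return to the caller, if any
			if length(calls) = 0 then exit end if
			il = calls[length(calls)][1]
			pos = calls[length(calls)][2]
			calls = calls[1..length(calls)-1]
		else
			item = il[pos]
			pos += 1
			if sequence(item) then
				push(item)                     -- literal string
			elsif item < firstUserID then
				call_proc(defs[item], {})      -- built-in word
			else
				calls = append(calls, {il, pos})
				il = defs[item]                -- user-defined word
				pos = 1
			end if
		end if
	end while
end procedure

-- user word 3 is defined as "dup join" (doubles the top string)
defs = append(defs, {1, 2})
run({"ab", 3})
```

After the last line runs, the string stack holds the single string "abab". Note that nothing in an IL sequence is a routine_id; routine_ids appear only inside defs.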

Queries

How are queries done? If the IL interpreter is a fundamental part of the application, then maybe queries are macros (IL sequences) -- words on the dictionary. A query word would just test the current note and set a built-in flag to TRUE if the note matches the criteria or FALSE if not.

Let's say we have a query like this:

  • Note has keyword "system": ((K:system))
  • Note has text "drove off": drove off

So the resulting macro looks like this:

"((K:system))" match? ifso "drove off" match?

First, a literal string is pushed onto the stack. Next, the word match? pops the stack and sets the flag to TRUE if the string is found in the current note, FALSE if not. Finally, ifso returns to the caller if the flag is FALSE. (There is no endif construction. If there is to be an "if" that does not return to the caller, I'll use the word then for it.)
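
To make the flag logic concrete, here is a small sketch that evaluates a flat query macro against one note. It is a simplification, not the real interpreter: W_MATCH and W_IFSO are hypothetical stand-ins for the word IDs of match? and ifso, and a single variable plays the role of the stack.

```euphoria
-- Sketch: evaluate a flat query macro against one note.
-- W_MATCH and W_IFSO are hypothetical word IDs for match? and ifso.
constant W_MATCH = 1, W_IFSO = 2

function runQuery(sequence il, sequence note)
-- returns the final value of the flag (1 = note matches, 0 = it does not)
	sequence pattern
	integer flag

	pattern = ""
	flag = 0
	for i = 1 to length(il) do
		if sequence(il[i]) then
			pattern = il[i]            -- literal string: "push" it
		elsif il[i] = W_MATCH then
			flag = 0
			if length(pattern) > 0 then   -- match() rejects an empty pattern
				flag = (match(pattern, note) > 0)
			end if
		elsif il[i] = W_IFSO then
			if not flag then
				return 0               -- flag is FALSE: return to the caller
			end if
		end if
	end for
	return flag
end function
```

Because ifso returns as soon as the flag is FALSE, a chain of match? ifso pairs behaves like a logical AND of the individual tests.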

A query with more options calls special routines to set or clear these options. For example:

"car" ignore_case start_word match? ifso "drove off" whole_words match?

Organization

The program should take its list of subjects from the names of the files in the data folder, e.g., "Reminders.n", "To-do.n", "Contacts.n", etc.
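
A possible sketch of building the subject list, assuming the ".n" extension from the examples above. dir() and D_NAME come from Euphoria's file.e; subjectsFrom() and listSubjects() are names I made up for illustration.

```euphoria
include file.e   -- dir(), D_NAME

function subjectsFrom(sequence names)
-- keep the names that end in ".n" and strip the extension
	sequence subjects, name
	integer len

	subjects = {}
	for i = 1 to length(names) do
		name = names[i]
		len = length(name)
		if len > 2 and equal(name[len-1..len], ".n") then
			subjects = append(subjects, name[1..len-2])
		end if
	end for
	return subjects
end function

function listSubjects(sequence folder)
-- returns the subjects found in the data folder ({} if it cannot be read)
	object entries
	sequence names

	names = {}
	entries = dir(folder)   -- -1 if the folder does not exist
	if sequence(entries) then
		for i = 1 to length(entries) do
			names = append(names, entries[i][D_NAME])
		end for
	end if
	return subjectsFrom(names)
end function
```

The comparison is case-sensitive, so a file named "Contacts.N" would be skipped; whether that matters depends on the platform.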

Interface

Keyboard

  • PgUp/PgDn to move between notes
  • Ctrl+Insert to create new note
  • Ctrl+Delete to delete current note
  • Ctrl+F to find a note
  • F3 to find next note with current criteria

Euphoria

If I write the software in Euphoria, then I have the source code and can add features later -- like macros.

File I/O

File operations are easy.

Reading notes from a file

This should be a procedure that sets ilFlag to TRUE if the file was read successfully and FALSE if not. The file is read in as a sequence of notes, which in turn is placed, along with the path to the file, into another sequence which in turn is saved into the "notes" sequence, which is thus a gathering of all open files.

To read a file, open() it and use gets() to read each line (including final "\n"):

constant TRUE = 1, FALSE = 0  -- if not already defined elsewhere
constant div = "#<{[NOTE]}>#"
constant divlen = length(div)

sequence ilStack, notes
integer ilFlag
ilStack = {}
notes = {}      -- one {filename, file} pair per open file

procedure fileOpen()
-- get filename from ilStack (popStack() is defined elsewhere)
-- set ilFlag = TRUE on success, FALSE otherwise

	sequence filename, buffer, file
	object line
	integer fn, max

	filename = popStack()
	fn = open(filename, "r")
	if fn = -1 then
		ilFlag = FALSE
		return
	end if

	file = {}       -- the notes read from this file
	buffer = ""     -- text of the note being read
	while 1 do
		line = gets(fn)
		if atom(line) then exit end if  -- gets() returns -1 at end of file
		max = length(line)
		if max > divlen then max = divlen end if
		if equal(div, line[1..max]) then
			-- divider line: finish the current note, if any
			if length(buffer) > 0 then
				file = append(file, buffer)
			end if
			buffer = ""
		else
			buffer &= line
		end if
	end while
	if length(buffer) > 0 then
		file = append(file, buffer)
	end if
	close(fn)
	notes = append(notes, {filename, file})
	-- current file = notes[$]
	ilFlag = TRUE
end procedure

Writing notes to a file

To save the file, you just reverse the process: before each non-empty note, you write the note-boundary (divider) line.

sequence line, divider
integer fn

fn = open("myfile.txt", "w")  -- "w" opens the file for writing
if fn = -1 then
    puts(1, "Couldn't open myfile.txt\n")
    abort(1)
end if

-- here, notes is assumed to be the list of note strings for one file
divider = div & "\n"
for i = 1 to length(notes) do
    line = notes[i]
    if not equal(line, "") then
        puts(fn, divider)
        puts(fn, line)
        if line[$] != '\n' then
            puts(fn, "\n")
        end if
    end if
end for
close(fn)

Finding text

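Use wildcard_match() (include wildcard.e) to find text (also use upper() or lower() for case-insensitive wildcard searches). Note that an asterisk matches zero or more characters and a question mark matches any single character. The pattern must match the entire string, so to find text anywhere in a note, put an asterisk at each end of the pattern.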

A query is a series of patterns (a sequence of sequences) to match against a sequence (note). Each pattern sequence is used as a filter. So to match a note against a query, you compare each pattern in the query to the text of the note using either match() or wildcard_match(), and if none of these calls returns zero, then you have a match.
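
A short sketch of that filter, assuming every pattern is a wildcard pattern and every comparison is case-insensitive. The function name noteMatches() is hypothetical; wildcard_match() and lower() come from wildcard.e.

```euphoria
include wildcard.e   -- wildcard_match(), lower()

function noteMatches(sequence note, sequence query)
-- query is a sequence of wildcard patterns; every one must be found
	for i = 1 to length(query) do
		-- wrap the pattern in asterisks so it may match anywhere in the note
		if not wildcard_match("*" & lower(query[i]) & "*", lower(note)) then
			return 0
		end if
	end for
	return 1
end function
```

For example, noteMatches("He drove off in the car.", {"drove off", "c?r"}) would return 1, while replacing "c?r" with "truck" would return 0. Per-pattern options such as whole_words would need more than this.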

Setting up the dictionary

Store words and definitions in separate sequences for faster access:

sequence words, defs
words = {
    "dup"
    , "drop"
    , "swap"
    -- and so on
}
defs = {
    routine_id("dup")
    , routine_id("drop")
    , routine_id("swap")
    -- and so on
}

Macro language

Yes, I'll eventually want to automate the application. This means working out a macro language.

I'll want to store macros as text, so that they can be read. However, the macro interpreter will translate this text into a sequence of routine_ids and then execute the routine_ids.

I'll want to record macros. Each keystroke that activates a command or sets a variable or option will need to be translated into a "word" (the name of a procedure).

Forth-like setup

The language is stack-based, so it looks "backwards" to some people, but there is no need to lump words into phrases before they make sense. Thus, each word is a command that does something.

The language consists only of words and literal strings. Lines of the language look like this:

ignore_case use_wildcards "s*thin* happen?" find
"New topic.n" saveas -- saves current file as "New topic"

Handling words

A word corresponds to an entry in a "dictionary." Each entry in the dictionary contains two sequences: (1) the word itself, and (2) the word's "code." The code consists of atoms (routine_ids for internal routines that are part of the application, or indexes to other words in the dictionary) and sequences (literal strings).

When the interpreter encounters a word at runtime, it runs the code stored in that word's dictionary entry.

Handling literal strings

A word that begins with a doublequote triggers the interpreter's "quote mode," which causes the interpreter to put the following words into one long string that will go on the top of the stack -- until a single doublequote is found, which terminates quote mode. (A double doublequote inside a string -- "" -- is treated as part of the string and is converted to a single doublequote.)

When the interpreter encounters a literal string at runtime, it pushes the string onto the stack. (Commands like find and saveas typically remove one string from the stack.)
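
Here is one way the text-to-IL step could look. It is a sketch under my own assumptions: it scans character by character instead of word by word, compile() is a hypothetical name, and an unknown word simply becomes ID 0 rather than going into a "not-found" string.

```euphoria
constant QUOTE = 34   -- ASCII code of the doublequote character

function compile(sequence text, sequence words)
-- returns IL: an atom (index into words, 0 if unknown) for each word,
-- and a sequence for each literal string
	sequence il, token
	integer i, c, inQuote

	il = {}
	token = ""
	inQuote = 0
	text &= ' '      -- sentinel so that the last word is flushed
	i = 1
	while i <= length(text) do
		c = text[i]
		if inQuote then
			if c = QUOTE then
				if i < length(text) and text[i+1] = QUOTE then
					token &= QUOTE           -- doubled quote: keep one
					i += 1
				else
					il = append(il, token)   -- closing quote ends the string
					token = ""
					inQuote = 0
				end if
			else
				token &= c
			end if
		elsif c = QUOTE or c = ' ' or c = '\t' or c = '\n' then
			if length(token) > 0 then
				il &= find(token, words)     -- a finished word
				token = ""
			end if
			if c = QUOTE then
				inQuote = 1
			end if
		else
			token &= c
		end if
		i += 1
	end while
	return il        -- an unterminated string is silently dropped
end function
```

Given the word list {"ignore_case", "find"}, the text ignore_case "say ""hi""" find would compile to the atom 1, the string say "hi", and the atom 2.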

Compiling calls to words

-- Application initialization

sequence word_definitions
word_definitions = {
    routine_id("fileOpen"),
    routine_id("fileSave"),
    routine_id("filePrint"),
    routine_id("editUndo")
    -- etc.
}
constant userIDs = length(word_definitions) + 1

----------------------------------------

-- Macro interpreter loop

if id < userIDs then
    call_proc(word_definitions[id], {})
else
    call_word(id)
end if

Etc.

The macro recorder typically gathers printable keystrokes into strings, which are passed to an insert command. Control keystrokes that open a dialog box, such as Ctrl+F for Find, put the recorder into another mode: It waits for the user to click OK before it gathers up all the commands and makes them part of the macro. If the user clicks Cancel, then the commands are discarded.

Also note the implications here for gathering things into larger objects. If a query consists of patterns, each of which consists of a string and some options, then you define the strings and options first, then make them a pattern. (Patterns go onto their own stack.) After you've defined all of your patterns, you define the query, using all of the patterns on the pattern stack.

Dictionary

The macro language has a built-in dictionary and a user-defined dictionary. When you record a macro, the macro automatically gets a name, with something like macro_1 def, which tells the recorder to define a new macro and to give it the name "macro_1". (The "def" word says to take the name in the "not-found" string and use it as the name of the new current macro. If a word is not found in the dictionary, the word is stored in a "not-found" string. [If the word were stored on the stack, that would encourage some hard-to-debug code involving the wrong items being in the wrong position on the stack, i.e., a word you expect to invoke a macro instead ends up on the stack, where it will confuse the heck out of the next word that needs to pop something.])

The dictionary itself is a sequence of dictionary entries. Each entry has two parts: a sequence (the procedure name, or "word") and an integer (a routine_id for the corresponding procedure, or an index into another sequence of other routine_ids).

So the interpreter takes a macro in text form and generates a sequence that the interpreter interprets. If an element of the sequence is an atom, it is either a routine_id or the ID of another interpreter sequence. If the element is another (sub)sequence, then it is a literal string to be pushed onto the stack.
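
Going the other way, from the interpreted sequence back to text, is mostly a matter of looking up names and doubling embedded doublequotes. A sketch, assuming every atom in the sequence is a valid index into a list of word names (decompile() and words are hypothetical names):

```euphoria
constant QUOTE = 34   -- ASCII code of the doublequote character

function decompile(sequence il, sequence words)
-- returns the text form of an IL sequence, one space between items
	sequence text, s

	text = ""
	for i = 1 to length(il) do
		if length(text) > 0 then
			text &= ' '
		end if
		if sequence(il[i]) then
			-- literal string: enclose in quotes, double any embedded quote
			s = il[i]
			text &= QUOTE
			for j = 1 to length(s) do
				if s[j] = QUOTE then
					text &= QUOTE
				end if
				text &= s[j]
			end for
			text &= QUOTE
		else
			text &= words[il[i]]   -- word ID: look up its name
		end if
	end for
	return text
end function
```

This relies on each word ID having exactly one text equivalent, which is the same condition that keeps the compiler simple.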

Macro file

Macros should be saved to a file, in text form, as notes. So the system might have a "Macros.n" file. (An interpreter needs to be able to make sense of the contents of a note, and to find macros quickly, so a macro needs a special keyword, such as [/@macro/].)

Also, the default behavior of the application is to summarize each note with the note's first line of printable text, so the recorder saves the macro's name, the word, as the first line ("def" goes on the line below).

Page last modified on January 17, 2007, at 06:00 PM