f18c764ffa
Monotone-Revision: 9054022ef1ca8aeba6e34842d27d9b94ce002b89 Monotone-Author: dev-unix.inverse.qc.ca Monotone-Date: 2006-06-15T19:34:10 Monotone-Branch: ca.inverse.sogo
145 lines
4.9 KiB
Plaintext
145 lines
4.9 KiB
Plaintext
Storage Backend
|
|
===============
|
|
|
|
The storage backend implements the "low level" folder abstraction, which is
|
|
basically an arbitary "BLOB" containing some document. The feature is that
|
|
we extract "quick access" / "searchable" attributes from the document content.
|
|
|
|
Further it contains the "folder management" API, as named folders can be stored
|
|
in different databases.
|
|
Note: we need a way to tell where "new" folders should be created
|
|
Note: to sync with LDAP we need to periodically delete or archive old folders
|
|
|
|
Folders have associated a type (like 'calendar') which defines the query
|
|
attributes and serialization format.
|
|
|
|
TODO
|
|
====
|
|
- hierarchies deeper than 4 (properly filter on path in OCS)
|
|
|
|
Open Questions
|
|
==============
|
|
|
|
System-meta-data in the blob-table or in the quick-table?
|
|
- master data belongs into the blob table
|
|
- could be regular 'NSxxx' keys to differentiate meta data from
|
|
|
|
Class Hierarchy
|
|
===============
|
|
|
|
[NSObject]
|
|
OCSContext - tracking context
|
|
OCSFolder - represents a single folder
|
|
OCSFolderManager - manages folders
|
|
OCSFolderType - the mapping info for a specific folder-type
|
|
OCSFieldInfo - mapping info for one 'quick field'
|
|
OCSChannelManager - maintains EOAdaptorChannel objects
|
|
|
|
TBD:
|
|
- field 'extractor'
|
|
- field 'value' (eg array values for participants?)
|
|
- BLOB archiver/unarchiver
|
|
|
|
Defaults
|
|
========
|
|
|
|
OCSFolderInfoURL - the DB URL where the folder-info table is located
|
|
eg: http://OGo:OGo@localhost/test/folder_info
|
|
|
|
OCSFolderManagerDebugEnabled - enable folder-manager debug logs
|
|
OCSFolderManagerSQLDebugEnabled - enable folder-manager SQL gen debug logs
|
|
|
|
OCSChannelManagerDebugEnabled - enable channel debug pooling logs
|
|
OCSChannelManagerPoolDebugEnabled - debug pool handle allocation
|
|
|
|
OCSChannelExpireAge - if that age in seconds is exceeded, a channel
|
|
will be removed from the pool
|
|
OCSChannelCollectionTimer - time in seconds. each n-seconds the pool will be
|
|
checked for channels too old
|
|
|
|
[PGDebugEnabled] - enable PostgreSQL adaptor debugging
|
|
|
|
URLs
|
|
====
|
|
|
|
"Database URLs"
|
|
|
|
We use the schema:
|
|
postgresql://[user]:[password]@[host]:[port]/[dbname]/[tablename]
|
|
|
|
BLOB Formats
|
|
============
|
|
|
|
- TBD
|
|
- iCal, XML
|
|
- problem with iCal is that a complete and valid iCal file is different from
|
|
just the vevent. so we basically need to parse those files in any case.
|
|
- XML: the iCal SaxDriver reports XML events, so we can easily store that as
|
|
XML
|
|
- we need to parse the BLOB for different clients anyway (iCal != iCal ...)
|
|
- XML: we could use some XML query extension to PG in the future?
|
|
|
|
Update: we now have OCSiCalFieldExtractor
|
|
- it parses the BLOB as an iCalendar file and extracts a set of fixed
|
|
keys:
|
|
- title - plain copy of "summary"
|
|
- uid - plain copy
|
|
- startdate - date as utime
|
|
- enddate - date as utime
|
|
- participants - CNs of attendees separated by comma ", "
|
|
- location
|
|
- partmails
|
|
- sequence
|
|
- TBD: iscyclic
|
|
- TBD: isallday
|
|
- TBD: cycles - I guess the client should fetch the BLOB to resolve
|
|
- the field extractor is accessed by OCSFolder using the folderinfo:
|
|
extractor = [self->folderInfo quickExtractor];
|
|
quickRow = [extractor extractQuickFieldsFromContent:_content];
|
|
|
|
Support Tools
|
|
=============
|
|
|
|
- tools we need:
|
|
- one to recreate a quick table based on the blob table
|
|
|
|
Notes
|
|
=====
|
|
|
|
- need to use http:// URLs for connect info, until generic URLs in
|
|
libFoundation are fixed (the parses breaks on the login/password parts)
|
|
|
|
QA
|
|
==
|
|
|
|
Q: Why do we use two tables, we could store the quick columns in the blob?
|
|
==
|
|
They could be in the same table. I considered using separate tables since the
|
|
quick table is likely to be recreated now and then if BLOB indexing
|
|
requirements change.
|
|
Actually one could even use different _quick tables which share a common BLOB
|
|
table.
|
|
(a quick table is nothing more than a database index and like with DB indexes
|
|
multiple ones for different requirements can make sense).
|
|
|
|
Further it might improve caching behaviour for row based caches (the quick
|
|
table is going to be queried much more often) - not sure whether this is
|
|
relevant with PostgreSQL, probably not?
|
|
|
|
Q: Can we use a VARCHAR primary key?
|
|
==
|
|
I asked in the postgres IRC channel and apparently the performance penalty of
|
|
string primary keys isn't big.
|
|
We could also use an 'internal' int sequence in addition (might be useful for
|
|
supporting ZideLook)
|
|
Motivation: the 'iCalendar' ID is a string and usually looks like a GUID.
|
|
|
|
Q: Why using VARCHAR instead of TEXT in the BLOB?
|
|
==
|
|
To quote PostgreSQL documentation:
|
|
"There are no performance differences between these three types, apart from
|
|
the increased storage size when using the blank-padded type."
|
|
So varchar(xx) is just a large TEXT. Since we intend to store mostly small
|
|
snippets of data (tiny XML fragments), I considered VARCHAR the more
|
|
appropriate type.
|