From ashmansk@hep.uchicago.edu Fri Mar 16 23:55:58 2001
Date: Mon, 11 Dec 2000 13:53:37 -0600 (CST)
From: Bill Ashmanskas
To: Mark Lancaster
Cc: Marjorie Shapiro, Jim Amundson, wolbers@fnal.gov, watts@physics.rutgers.edu, r.stdenis@physics.gla.ac.uk, stefano.belforte@ts.infn.it, rharris@fnal.gov, ksmcf@fnal.gov, sexton@fnal.gov, Rob Snihur
Subject: Re: Offsite Database Export Review

On Sun, 10 Dec 2000, Mark Lancaster wrote:

Thanks again for filling me in. I have several follow-ups, one or two of which are, I think, pertinent to database export; the rest relate more generally to the DB, not to DB exporting, so you are well within your rights just to "pass."

Your COTWPO table seems to have a row per version per wire, so reading one complete alignment requires reading O(nwires) small rows, rather than one big row. Is it safe for me to assume that this comes with no particular cost? (I think probably the benchmark I quoted--the read times for the complete set of CTC constants, stored in whatever way you plan to store the COT constants--would answer this.) Again you are free to dismiss this question as "not pertinent for this week's review."

> For now I am not convinced we should get hung up on the details of access
> times and storage overheads - with some optimisation these will all be
> acceptable. We should rather I think concentrate on how easy it will be to
> administer the exports.

I am still curious to see the results of a simple extension of the performance test that you already did, which is to write 1000 rows in run order, then read back those rows in random order, for both database candidates. I consider this semi-pertinent, as a comparison of the two databases' performance.

On my two points, "first access vs subsequent access" and "frequently used vs infrequently used" constants, I think that maybe twice I managed not to ask the question that I really wanted to ask (though the answers were useful anyway).
I'm wondering if some of the constants, such as SVX pedestals, will be deemed by consensus to be so infrequently needed by remote users (not necessarily node by node) that they would not be exported, and then one could fall back to some kind of direct socket-based access to the FNAL database. I'm not so concerned about having to wait until the next day to read constants from FNAL. If the pure-network fall-back mechanism exists for a subset of the DB, though, then maybe a process-by-process switch could be thrown (getenv, talk-to, etc.), allowing one to function with no exported database. I didn't necessarily mean that one wants to cook up a separate export list for each remote institution; that sounds like a big hassle for FNAL-based administrators.

I'm wondering if direct TCP/IP socket access to some FNAL DB server is a straightforward upgrade/downgrade of the current plan, since it seems like a really simple fallback option for new nodes, temporary nodes, transient export failures, and who knows what other unforeseen scenarios.

> True CID does map onto comp/attr/run/version and can then via SET_RUN_MAPS
> and USED_SETS be mapped to a higher level identifier e.g. beam-fix-hack-1.

Great, this sounds like exactly the solution we need for the worst of the Run I DB's deficiencies, the what-version-to-use problem. I still have no concept, though, of how one specifies, for a given job, whether or not to use "beam-fix-hack-1." It would seem nicer to do it by talking to a central DB manager, rather than requiring each piece of code that accesses the DB to have a talk-to parameter allowing one to select the set of constants to be used. But this is not at all pertinent to tomorrow's review.

> The only space/compressions/access issues are with SVXPED - I think with
> everything else you buy enough disk and don't worry about it - it will not
> affect your analysis.
> Clearly there is some padding with storing data in
> tables (key, index overhead) compared with a binary C-struct file. We get
> some padding at the expense of increased functionality and a cure for run-1
> deficiencies.

I more or less accept this as a general argument, but I won't stop worrying about it until I see the actual numbers. If I am ever a reviewer for general database issues, I will want a quantitative performance (time and space) analysis. But I probably have no right to expect an answer to this question in the context of this meeting (unless there is a gross difference in time/space performance between the two remote database technologies)--again, just curious.

Thanks again for all the detailed information.

-Bill
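
A minimal sketch of the read-back test proposed above (write 1000 rows in run order, then read the same rows back in random order). Everything specific here is an assumption for illustration: in-memory SQLite is only a stand-in and is not either of the two database candidates, and the table name `constants`, the 256-byte dummy payload, and the function name `run_benchmark` are hypothetical. A real comparison would run the same loop against each candidate with realistic constants.

```python
import random
import sqlite3
import time


def run_benchmark(n_rows=1000, payload_bytes=256, seed=0):
    """Write n_rows in run order, then read them back in random order.

    Returns (write_seconds, read_seconds, rows_read).
    """
    # Assumption: in-memory SQLite stands in for the real database candidates.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE constants (run INTEGER PRIMARY KEY, payload BLOB)")

    payload = bytes(payload_bytes)  # dummy block of zero bytes, not real constants

    # Phase 1: insert rows sequentially, in increasing run number.
    t0 = time.perf_counter()
    for run in range(n_rows):
        conn.execute("INSERT INTO constants VALUES (?, ?)", (run, payload))
    conn.commit()
    write_seconds = time.perf_counter() - t0

    # Phase 2: read every row back once, in a shuffled order.
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)

    rows_read = 0
    t0 = time.perf_counter()
    for run in order:
        row = conn.execute(
            "SELECT payload FROM constants WHERE run = ?", (run,)
        ).fetchone()
        if row is not None and len(row[0]) == payload_bytes:
            rows_read += 1
    read_seconds = time.perf_counter() - t0

    conn.close()
    return write_seconds, read_seconds, rows_read


if __name__ == "__main__":
    w, r, n = run_benchmark()
    print(f"wrote {n} rows in {w:.4f} s, read them back randomly in {r:.4f} s")
```

The timings from such a stand-in say nothing about the actual candidates; the point is only the shape of the test, with the write and random-read phases timed separately so the two can be compared per database.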