The complete survey results may be downloaded from here.
Also see Amazon SimpleDB.
Everyone talks about the desert island applications, tools, pictures, and music that they carry with them on their USB stick, but not many people consider using a USB stick to communicate information to first responders. In addition to including an ICE (In Case of Emergency) text document for first responders, I also run XAMPP/MediaWiki directly from the USB stick to manage my daily brain-dumps.
In Case Of Emergency
So why include first responder ICE data on my USB stick when I don’t have any medical conditions to speak of? Some of my personal activities (bicycle racing and training) are dangerous, and I prefer to carry a product like RoadID instead of a wallet for these activities. I want emergency personnel to have immediate access to my emergency information in the event of an accident. The information includes: full name, all emergency contacts, doctors, blood type, conditions, regular medications, allergies, and basic health insurance info (name and phone only, no IDs). I will have a USB stick affixed to my RoadID whenever I head out of the house, and the RoadID includes a line that says “EMERGENCY SEE USB”. A simple text document named “In Case Of Emergency.txt” on my USB stick will provide quick access to my emergency information. There are commercial USB products for this purpose, of course, but there is no reason a DIY USB stick labeled “In Case Of Emergency” would be any less visible or effective with a first responder. In fact, I asked the Police Chief of my little Pennsylvania Borough about first responders and USB sticks, and here is his response:
“Currently, there is no protocol to look for USB flash drives on an injured, unconscious, and/or Alzheimer type persons. The only thing that first responders have been asked to look for in these situations are the commercially available Medic Alert bracelets and or necklaces.
Through the years as an Assistant Fire Chief, EMT/Paramedic Assistant, and Police Officer, I’ve received a lot of info on this issue. As of yet, we have not been told to look for USB flash drives. We do look for a wallet, purse, mobile phone or the Road ID type of info shown on the site that you provided etc. and then go through it to check for any info such as medical issues and contact information.
…
I am certain if a person is in the unfortunate position that you described, and we found a USB flash drive marked as you indicated, we would quickly plug it into our car computer to see what we could find.”
While I appreciate the willingness of RoadID, Google, Microsoft Corp, and Revolution Health Group, LLC to manage my personal emergency information online and possibly make it available to first responders, I plan on keeping this information as close to me as possible. It is interesting to note that only 14% of medical practices keep records electronically, so at some point my encrypted Apache Derby database will expand to include some scanned medical records as well.
Knowledge Portfolio
Let’s move on to brain-dumps. While I invest regularly in my knowledge portfolio, I have not done the greatest job of electronically centralizing all those wonderful little nuggets of continuous self-improvement. Problem solved: “Wiki on a Stick”. I downloaded two compressed files (XAMPP Lite and MediaWiki), decompressed both files, and copied the contents to my USB stick. After a few simple property settings made through my browser, my wiki was alive and serving all of my desktops.
Household Inventory
The most important piece of this discussion is managing my life’s inventory in an embedded Apache Derby database. This is the data that will help me rebuild my world in the event of some personal disaster such as a fire, tornado, or earthquake. I’m talking about storing scanned documents that include the following information:
• all account information and recent bills
• paystubs
• important receipts and canceled checks
• birth/marriage certificates
• deeds
• tax papers
• insurance policies
• stock/bond certificates
• professional appraisals
• photographs of possessions that include a member of the family holding the item
• photographs of the house, every room, every closet, basement, garage, and automobiles
Testing Apache Derby Encryption
Is my data safe in an Apache Derby BLOB? To establish an encryption test baseline, I inserted scanned documents into an unencrypted Derby database. I then made a dd image of the USB stick using FTK Imager Lite. Finally, using scalpel, I was able to carve out the scanned PDF documents stored in Apache Derby in under a minute. As expected, the scanned PDF documents were easily retrievable from an unencrypted Apache Derby database regardless of the database user authentication settings. I repeated the same process for an encrypted Apache Derby database, and was unable to carve out the scanned documents using scalpel. I now have a warm fuzzy feeling that my life’s inventory is safe.
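The baseline test above works because a carving tool only needs to recognize a file's header and footer bytes in the raw image. Here is a minimal Python sketch of that idea. It is not scalpel's implementation: the function name carve_pdfs and the max_size limit are my own, and the one-byte XOR is only a stand-in for "the bytes no longer appear in the clear", not real encryption. Real PDFs can also contain more than one %%EOF marker, which this sketch ignores.

```python
def carve_pdfs(image: bytes, max_size: int = 10_000_000) -> list[bytes]:
    """Return byte ranges that look like PDF files inside a raw disk image."""
    header, footer = b"%PDF-", b"%%EOF"
    found = []
    pos = 0
    while True:
        start = image.find(header, pos)
        if start == -1:
            break
        # Look for the first footer within max_size bytes of the header.
        end = image.find(footer, start, start + max_size)
        if end == -1:
            # Header without a footer in range: skip past it and keep looking.
            pos = start + len(header)
            continue
        end += len(footer)
        found.append(image[start:end])
        pos = end
    return found


# Simulated unencrypted storage: the document sits in the clear between other bytes.
fake_pdf = b"%PDF-1.4\nscanned deed\n%%EOF"
plain_image = b"\x00" * 64 + fake_pdf + b"\xff" * 64
carved = carve_pdfs(plain_image)

# Stand-in for encrypted storage (NOT real encryption): the signatures disappear.
scrambled = bytes(b ^ 0x5A for b in plain_image)
carved_scrambled = carve_pdfs(scrambled)
```

With the plain image the document comes back intact; with the scrambled image there is nothing for the carver to latch onto, which is the behavior I saw with the encrypted Derby database.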
I'll post the JDBC application source shortly.
My digital wallet data security is job one. In addition to the typical user authentication database access restrictions, Apache Derby provides complete encryption of on-disk data. Everything is encrypted: tables, indexes, transaction log, table data, temporary files, system metadata, and so forth. Out-of-the-box encryption strength is 56-bit DES, but this is easily switched to another encryption algorithm. I do plan on periodically verifying the physical data file security with FTK Imager Lite and WinHex, or some other combination of cyber forensics tools. Come to think of it, the default 56-bit DES is probably enough considering that I regularly entrust waitrons with my credit card information, and retail staff with my driver’s license information for check verification purposes.
Apache Derby is a fully functional RDBMS written entirely in Java. It runs in any JVM (version 1.4 or higher). For now, I plan on using the Apache Derby ij JDBC application with Linux and Windows scripting to manage my digital wallet data. I may also incorporate the use of the SQuirreL SQL universal client. I haven’t had issues with either on my openSUSE or Windows PCs. In my next post, we’ll explore this project further.
SELECT tabname, remarks FROM syscat.tables WHERE tabschema = 'DEV'
Neither Oracle nor Microsoft SQL Server stores the remarks inline with the other metadata in the system catalog. You access Oracle remarks using the system views USER_TAB_COMMENTS and USER_COL_COMMENTS. In Microsoft SQL Server, you access remarks through the sys.extended_properties catalog view. Microsoft SQL Server does not limit you to applying only one piece of explanatory information per schema object; however, the stored procedures to create and manipulate schema object remarks are laborious when compared to the other vendors.
Do yourself a favor and spend some time reviewing the schema object documentation capabilities of your DBMS. Remember the remarks that you supply to the system catalog are available to the team when they are working with the physical database. The team will love you for it, especially when you are not around.
Consider the following DDL:
CREATE TABLE Test (
col1 UNIQUEIDENTIFIER NOT NULL PRIMARY KEY CLUSTERED DEFAULT NEWSEQUENTIALID(),
col2 VARCHAR(100) NOT NULL
)
GO
CREATE PROCEDURE TestProc
@list XML
AS
BEGIN
SELECT Test.col1, Test.col2
FROM @list.nodes('/List/Value') List(col1)
INNER JOIN Test ON Test.col1 = List.col1.value('@col1', 'UNIQUEIDENTIFIER')
END
GO
and the following Table data (in INSERT order):
col1 col2
------------------------------------ ---------------------------------
D443AD7A-9293-DC11-9042-00065B83FA16 One
D543AD7A-9293-DC11-9042-00065B83FA16 Two
D643AD7A-9293-DC11-9042-00065B83FA16 Three
D743AD7A-9293-DC11-9042-00065B83FA16 Four
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D943AD7A-9293-DC11-9042-00065B83FA16 Six
DA43AD7A-9293-DC11-9042-00065B83FA16 Seven
DB43AD7A-9293-DC11-9042-00065B83FA16 Eight
DC43AD7A-9293-DC11-9042-00065B83FA16 Nine
DD43AD7A-9293-DC11-9042-00065B83FA16 Ten
and finally the following batch:
DECLARE @list XML
SET @list = '<List>
<Value col1="D843AD7A-9293-DC11-9042-00065B83FA16" />
<Value col1="D943AD7A-9293-DC11-9042-00065B83FA16" />
<Value col1="D843AD7A-9293-DC11-9042-00065B83FA16" />
<Value col1="D943AD7A-9293-DC11-9042-00065B83FA16" />
<Value col1="D843AD7A-9293-DC11-9042-00065B83FA16" />
<Value col1="D943AD7A-9293-DC11-9042-00065B83FA16" />
</List>'
EXEC TestProc @list
SELECT t.col1, t.col2
FROM @list.nodes('/List/Value') List(col1)
INNER JOIN Test t ON t.col1 = List.col1.value('@col1','uniqueidentifier')
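While the ad hoc SELECT t.col1, t.col2… returned its rows in XML document order, EXEC TestProc @list did not, because the optimizer chose a plan that specified a Merge Join algorithm. Neither query has an ORDER BY clause, so SQL Server does not guarantee any particular row order for either of them.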
col1 col2
------------------------------------ ---------------------------------
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D943AD7A-9293-DC11-9042-00065B83FA16 Six
D943AD7A-9293-DC11-9042-00065B83FA16 Six
D943AD7A-9293-DC11-9042-00065B83FA16 Six
To preserve the XML document order within our stored procedure, we can simply use the SQL Server 2005 ranking functions. So TestProc now looks like this:
ALTER PROCEDURE TestProc
@list XML
AS
BEGIN
SELECT t.col1, t.col2
FROM
(SELECT ROW_NUMBER() OVER (ORDER BY preserveCount) AS rowNumber, PreserveOrder.col1
FROM
(SELECT List.col1.value('@col1','uniqueidentifier') AS col1
, 0 AS preserveCount
FROM @list.nodes('/List/Value') List(col1)) PreserveOrder
) OrderedList
INNER JOIN Test t ON t.col1 = OrderedList.col1
ORDER BY OrderedList.rowNumber ASC;
END
EXEC TestProc @list will now produce the proper results.
col1 col2
------------------------------------ ---------------------------------
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D943AD7A-9293-DC11-9042-00065B83FA16 Six
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D943AD7A-9293-DC11-9042-00065B83FA16 Six
D843AD7A-9293-DC11-9042-00065B83FA16 Five
D943AD7A-9293-DC11-9042-00065B83FA16 Six
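The general idea behind the fix is to carry each list item's position through the join and sort on it at the end. The following Python sketch shows that idea in isolation, using SQLite instead of SQL Server. It is a different technique from the ROW_NUMBER() query above, not a port of it: the list_item table, the ordinal column, and the short keys "E" and "F" (standing in for the GUIDs) are all my own assumptions for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Short keys stand in for the uniqueidentifier values in the listing above.
doc = '<List><Value col1="E"/><Value col1="F"/><Value col1="E"/><Value col1="F"/></List>'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (col1 TEXT PRIMARY KEY, col2 TEXT NOT NULL)")
conn.executemany("INSERT INTO Test VALUES (?, ?)", [("E", "Five"), ("F", "Six")])

# Record each element's document position explicitly before joining.
conn.execute("CREATE TABLE list_item (ordinal INTEGER, col1 TEXT)")
conn.executemany(
    "INSERT INTO list_item VALUES (?, ?)",
    [(i, v.get("col1")) for i, v in enumerate(ET.fromstring(doc).iter("Value"), start=1)],
)

# The ORDER BY on the recorded position makes the output order independent
# of whichever join algorithm the engine picks.
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT t.col2 FROM list_item li "
        "JOIN Test t ON t.col1 = li.col1 "
        "ORDER BY li.ordinal"
    )
]
```

Because the order comes from an explicit sort key rather than from the join, the rows come back as Five, Six, Five, Six regardless of the plan.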
For every database in our Mainline codeline folder structure there are two child folders: Previous and Sprint. The Previous folder contains DDL files for every database schema object, including all BCP domain/default data files, necessary to restore a reference image of the last released database schema. In the folder structure below, the Previous folder contents would construct a 2007.2 equivalent database schema for MyDatabase2. The Sprint folder contains the DDL and DML necessary for active development work. In older codelines, the Sprint folder provides a quick snapshot of what happened in the database schema for that particular release.
Project
    2007.2
    Mainline
        Business Logic
        Unit Tests
        Database
            bin
            Reference
            MyDatabase1
            MyDatabase2
                Previous
                    Tables
                    StoredProcedures
                    …
                Sprint
Sprint database schema object updates are applied using this PowerShell/SMO script and controlled via an XML manifest. Every DDL or DML modification is listed in the manifest. A SprintUpdate XML Element in the manifest identifies one modification, and is decorated with appropriate sprint and sprint backlog information XML attributes; furthermore, there is an XML attribute to control whether or not the modification is actually applied to the specified database during script execution. The manifest will contain DDL and DML database schema modifications for all Sprints until the Release Sprint. When a Release Sprint completes, the Mainline codeline is renamed to the appropriate release version id and a new Mainline codeline is constructed.
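To make the manifest idea concrete, here is a small Python sketch that reads a manifest and picks out the modifications that should be applied. Only the SprintUpdate element name comes from the description above. The root element, the attribute names (sprint, backlogItem, apply, file), and the script file names are hypothetical, and the real PowerShell/SMO script may organize things differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest: one SprintUpdate element per DDL or DML modification.
MANIFEST = """
<SprintUpdates>
  <SprintUpdate sprint="1" backlogItem="101" apply="true"  file="101_AddCustomerTable.sql" />
  <SprintUpdate sprint="1" backlogItem="102" apply="false" file="102_LoadDefaultRegions.sql" />
  <SprintUpdate sprint="2" backlogItem="117" apply="true"  file="117_AlterOrderIndex.sql" />
</SprintUpdates>
"""


def updates_to_apply(manifest_xml: str) -> list[str]:
    """Return script files flagged for execution, in manifest order."""
    root = ET.fromstring(manifest_xml)
    return [
        update.get("file")
        for update in root.iter("SprintUpdate")
        if update.get("apply") == "true"
    ]


selected = updates_to_apply(MANIFEST)
```

Keeping the on/off switch in the manifest means a modification can be held back without deleting its script or its sprint backlog reference.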
Invoking the scriptdatabase.ps1 script:
C:\>powershell .\scriptdatabase.ps1 MyInstanceName MyDatabaseName C:\Project\MainLine\Database\MyDatabaseName\Previous
where "MyInstanceName" is the name of a Microsoft SQL Server instance, "MyDatabaseName" is the name of an existing database on the specified instance, and "C:\Project\MainLine\Database\MyDatabaseName\Previous" is the root path for the codeline folders. In other words, the script will create and populate C:\Project\MainLine\Database\MyDatabaseName\Previous\StoredProcedures, C:\Project\MainLine\Database\MyDatabaseName\Previous\ForeignKeys, C:\Project\MainLine\Database\MyDatabaseName\Previous\Tables, etc.
In the next post I will explain how Sprint updates are handled.
[The source code is also available as one zip file: ddj071022hemdal.zip.]
1. easy restoration of previous versions
2. a repeatable process that supports autonomous work
3. concurrent updates to the database schema objects
4. synchronization with the application code to ensure stable builds
5. storage of production regression test data
6. storage of default domain data
7. quick deployment of a specific version of a database schema through reference objects
8. a quick view of what happened to the database schema for a particular release
Project
    2007.1.0
    2007.1.1
    2007.1.2
    2007.2
    Mainline
        Business Logic
        Unit Tests
        Database
            Library
            Reference
            MyDatabase1
            MyDatabase2
                Previous
                    Tables
                    Foreign Keys
                    Defaults
                    DML Triggers
                    DDL Triggers
                    Check Constraints
                    Functions
                    XML Schemas
                    Stored Procedures
                    Views
                    Options
                    Security
                    Indexes
                    Synonyms
                    Data
                Sprint
The Database folder structure in each codeline above is similar in format. Each Database folder contains a Library folder, a Reference folder, and a folder for each database in the codeline. The Library folder contains the PowerShell scripts and other files necessary for providing a repeatable database schema restoration process. The Reference folder contains pre-built implementation(s) of each database contained in the codeline. The pre-built database(s) do not have any Sprint updates applied, but they are populated with data. The Sprint folder contains the DDL and DML scripts for active development work. Use descriptive script names for DDL and DML scripts to provide a quick view of database schema modifications in previous releases. I also include a reference to the Sprint Backlog Item in the name for traceability.
I will post PowerShell/SMO scripts to support this codeline structure shortly.
For this post, I just wanted to expose a little PowerShell script to demonstrate how amazingly powerful this technology is. The following code enumerates the schema objects for a user-supplied Microsoft SQL Server database, and generates the corresponding T-SQL create script files for objects whose names match a user-supplied regular expression. I chose to enumerate the objects based upon the object name, but I have left script comments in to show how to enumerate by schema object type instead (e.g. Stored Procedure, Table, Foreign Key, etc.). To create T-SQL create script files for all database schema objects, simply supply ".*" as the regular expression to match.
# Assumption: the original listing does not show how these four values are
# supplied; a positional param block is one way to provide them.
param([string]$serverName, [string]$dbName, [string]$objectPattern, [string]$outputPath)

# Scripts one schema object (drop, then create) into a file under $targetPath.
function ScriptSqlObject([string]$obj, [string]$objType, [string]$targetPath)
{
    [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | out-null
    $server = new-object ('Microsoft.SqlServer.Management.Smo.Server') $serverName
    $db = $server.Databases[$dbName]
    $o = new-object ('Microsoft.SqlServer.Management.Smo.Scripter') ($server)
    $o.Options.WithDependencies = $false
    # sys.objects type codes: U = user table, P = stored procedure
    switch ( $objType.Trim() )
    {
        "U" { $actualObject = $db.Tables[$obj]; $extension = ".tbl"; break; }
        "P" { $actualObject = $db.StoredProcedures[$obj]; $extension = ".sp"; break; }
        Default { return; }
    }
    # start from a clean file, because the scripter appends to it twice
    $f = [System.IO.Path]::Combine($targetPath, $actualObject.Name + $extension)
    if ( [System.IO.File]::Exists($f) -eq $true )
    {
        [System.IO.File]::Delete($f)
    }
    $o.Options.FileName = $f
    $o.Options.AppendToFile = $true
    $o.Options.ScriptDrops = $true
    $o.Options.IncludeIfNotExists = $true
    # script the drop
    $o.Script($actualObject.Urn)
    $o.Options.DriPrimaryKey = $true
    $o.Options.ScriptDrops = $false
    $o.Options.IncludeIfNotExists = $false
    # script the create
    $o.Script($actualObject.Urn)
}

# load the list of schema objects from the catalog
$cn = new-object System.Data.SqlClient.SqlConnection
$cn.ConnectionString = "Server=$serverName;Database=$dbName;Integrated Security=True"
$cmd = new-object System.Data.SqlClient.SqlCommand
$cmd.CommandText = "SELECT * FROM sys.objects"
$cmd.Connection = $cn
$a = new-object System.Data.SqlClient.SqlDataAdapter
$a.SelectCommand = $cmd
$ds = new-object System.Data.DataSet
$a.Fill($ds) | out-null
$cn.Close()

# map each matching object name to its type code
$names = @{}
$ds.Tables[0] | %{if($_.Name -match [regex]$objectPattern) { $names[$_.Name] = $_.Type } }
# to enumerate all table objects specify "U\s" as input to $objectPattern
#$ds.Tables[0] | %{if($_.Type -match [regex]$objectPattern) { $names[$_.Name] = $_.Type } }
foreach ( $key in $names.Keys )
{
    ScriptSqlObject $key $names[$key] $outputPath
}
#$sr=new-object System.IO.StreamReader("C:\")
#$script=sr.ReadToEnd
#$db.ExecuteNonQuery($script)
While the SQLite library core is C code, the number of language bindings available is staggering. As my SQLite introduction, I developed a simple .NET 2.0 assembly that uses SQLite to aggregate Internet Explorer favorites from all of my PCs. The supplied command-line administration tool provides the capability to quickly generate a consumable HTML file of aggregated links. What is nice is that I can use the power of SQL to filter the aggregated links by title, date, grouping, etc. No surprise that Mozilla's Firefox 3 is moving to SQLite for storage of its bookmarks, among other things.
One of the distinctive SQLite features that did lead to a data defect was column datatype affinity. While traditional databases use static datatyping on columns, SQLite uses the column datatype as a recommendation. In other words, you can store any value of any datatype into any column (except a column that specifies INTEGER PRIMARY KEY). I had improperly formatted the creationTime attribute for insertion, which created a problem when I attempted to use it in an ORDER BY later on. Besides column type affinity, it's also important to keep in mind that SQLite does not enforce referential integrity (RI) and complicates table evolution with limited ALTER TABLE options.
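The affinity problem is easy to reproduce. The sketch below uses Python's built-in sqlite3 module rather than my .NET assembly, and the favorite table and its sample values are made up for illustration; only the creationTime column name comes from my project. A badly formatted date string is accepted into an INTEGER column without complaint, kept as text, and then sorted after every numeric value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorite (title TEXT, creationTime INTEGER)")
conn.executemany(
    "INSERT INTO favorite VALUES (?, ?)",
    [
        ("a", 1190000000),
        # Not numeric-looking, so INTEGER affinity leaves it stored as TEXT.
        ("b", "10/22/2007"),
        ("c", 1180000000),
    ],
)

# typeof() reports the storage class actually used for each value.
rows = conn.execute(
    "SELECT title, typeof(creationTime) FROM favorite ORDER BY creationTime"
).fetchall()
```

In SQLite's sort order, INTEGER and REAL values come before TEXT values, so the mis-formatted row lands at the end no matter what date it was meant to represent.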
SQLite was extremely simple to work with, and it deserves a closer look for use in other endeavors. You can read the source listing in HTML here.
I started out with a question: how can I quickly assess data quality in 300 of my production Microsoft SQL Server databases, at any time, using some of Scott’s regression testing criteria? I wanted these assessment tests to be close to the database itself: a simple table with a few stored procedures. I started out simple for this post by testing column default values. The first thing I needed was a stored procedure, ExecLiteral, that would automatically execute intermediate results similar to what sp_execresultset did in Microsoft SQL Server 2000. Next I needed to generate the reference XML data required to validate column default values and column value existence for future tests in other databases. The idea was a simple stored procedure that used the INFORMATION_SCHEMA.COLUMNS view and FOR XML EXPLICIT to generate the XML that I needed to validate the column defaults, and to test the existing column data for value existence. In future posts, we may evolve this to include actual default value INSERT tests. The stored procedure that identifies missing defaults uses the new T-SQL EXCEPT operator to check for missing default schema values. Then ExecLiteral is invoked to test for the existence of a value in a column with a specified default.
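The EXCEPT check is a set difference: take the defaults the reference data says should exist and subtract the defaults the database actually has. Here is a minimal Python sketch of that step, with SQLite standing in for SQL Server. The reference_default and actual_default tables and their sample rows are hypothetical; in my stored procedure the reference side comes from the XML and the actual side from INFORMATION_SCHEMA.COLUMNS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reference_default (table_name TEXT, column_name TEXT, column_default TEXT);
    CREATE TABLE actual_default    (table_name TEXT, column_name TEXT, column_default TEXT);

    -- What the reference data says every database should have.
    INSERT INTO reference_default VALUES
        ('Orders', 'Status',    '(''NEW'')'),
        ('Orders', 'CreatedOn', '(getdate())'),
        ('Users',  'IsActive',  '((1))');

    -- What this particular database really has: one default is missing.
    INSERT INTO actual_default VALUES
        ('Orders', 'Status',   '(''NEW'')'),
        ('Users',  'IsActive', '((1))');
""")

# Rows in the reference set that have no exact match in the actual set.
missing = conn.execute("""
    SELECT table_name, column_name, column_default FROM reference_default
    EXCEPT
    SELECT table_name, column_name, column_default FROM actual_default
""").fetchall()
```

Because EXCEPT compares whole rows, a default that exists but has a different definition shows up as missing too, which is what I want from a regression check.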
Let’s see where this takes us. You can read the source listing in HTML and text.