Archive for July, 2009

Chapter 5 ADVANCED DATA RETRIEVAL AND MODIFICATION 311

Friday, July 31st, 2009

Typically, if you're adding 50% or more of the current row count to the table, you should drop the indexes first, because they slow down the inserts, and the indexes will be in better shape if you rebuild them after adding that much data anyway. Regardless, fast bulk copy doesn't work if there are indexes on the table unless the table is empty to begin with.

Another option for large bulk inserts sets the batch size. Batch size is the number of rows that will be inserted as part of a transaction. If the batch size is large, then the transaction that is generated will be large, and it may cause your transaction log to fill up. If the batch size is too small, SQL Server spends too much time committing transactions rather than writing your data, and performance suffers. Typically, a batch size between 1,000 and 10,000 is used. Files with lots of rows and very few columns tend to benefit from higher batch sizes. By default, BCP does the entire operation in one batch, but it lets you know when it finishes sending each 1,000 rows to SQL Server.

You should be aware of a few special options, as shown in Table 5.3, that are used for importing data into SQL Server.

TABLE 5.3 BCP DATA IMPORT PARAMETERS

Parameter   Function
-k          Tells SQL Server that if some of the data coming in has nulls in it, it shouldn't apply the default values; it should just leave the column null.
-E          If the table being imported into has an identity column, this option tells SQL Server to use the values in the file rather than the automatically created values from the IDENTITY property.
-R          Tells BCP to use the regional time, date, and currency settings rather than the default, which is to ignore any regional settings.
-b          Gives the batch size, the number of rows in each batch. Defaults to all of the rows in one batch.
-h          Gives Bulk Insert Hints (see Table 5.4).

The -h option enables you to specify one or more hints to SQL Server about how to process the bulk copy. These hints enable you to fine-tune BCP performance, and are listed in Table 5.4.
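The batch-size trade-off is easier to see with numbers. The following is a minimal Python sketch, not part of BCP itself: batch_count and import_flags are hypothetical helper names, and the only BCP switches used are the ones listed in the table above (-k, -E, -b, -h). It counts how many transactions a load would commit and assembles the matching import switches.

```python
import math
from typing import List, Optional, Sequence


def batch_count(total_rows: int, batch_size: Optional[int] = None) -> int:
    """Number of transactions committed for a load of total_rows rows."""
    if total_rows <= 0:
        return 0
    if batch_size is None:
        return 1  # default behaviour described in the text: everything in one batch
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(total_rows / batch_size)


def import_flags(batch_size: Optional[int] = None,
                 keep_nulls: bool = False,
                 keep_identity: bool = False,
                 hints: Sequence[str] = ()) -> List[str]:
    """Assemble the import-related switches as a list of arguments."""
    flags: List[str] = []
    if keep_nulls:
        flags.append("-k")          # leave nulls as nulls, skip column defaults
    if keep_identity:
        flags.append("-E")          # keep identity values from the file
    if batch_size is not None:
        flags += ["-b", str(batch_size)]
    if hints:
        flags += ["-h", ",".join(hints)]
    return flags


if __name__ == "__main__":
    print(batch_count(1_000_000, 5000))
    print(import_flags(batch_size=5000, keep_identity=True, hints=("TABLOCK",)))
```

A million-row file at a batch size of 5,000 commits a few hundred times; at a batch size of 10 it would commit a hundred thousand times, which is where the commit overhead described above starts to dominate.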


NOTE 310 Part I EXAM PREPARATION So far,

Thursday, July 30th, 2009

So far, all the examples have involved exporting data from SQL Server. Now it's time to take a look at importing data. The BCP command works the same both ways: you should specify IN instead of OUT to import data into SQL Server.

For large files, with more than a couple thousand rows perhaps, you should turn on the Select Into/Bulkcopy option for the database, or set the database recovery mode to BULK_LOGGED or SIMPLE. These options disable all transaction log backups while they are turned on, and you must do a full backup to get transaction log backups to work afterwards. For certain operations, namely those involving SELECT INTO and bulk copy, the option changes how transaction logging works. Typically, whenever you insert a row, SQL Server logs the row being inserted into the transaction log. This prevents data loss in case of power outage and enables you to do point-in-time database recovery. This also significantly slows down the process of inserting huge numbers of records. Switching to BULK_LOGGED or SIMPLE changes the behavior so that rather than logging the entire row insert, SQL Server just logs the page allocations, which involves a lot less overhead. Basically, when you do a BCP and the database is set for BULK_LOGGED or SIMPLE recovery, all the data goes into allocated space in the database; and when the copy commits, it attaches the allocated space to the table. It's really fast, and it's still very safe because all the page allocations are logged, and if the transaction fails and has to roll back, the pages are deallocated. This process is called Fast Bulk Copy.

In addition to having the BULK_LOGGED or SIMPLE recovery option selected, you need to do a few other things to get fast bulk copy to work. The target table can't be involved in replication. The target table can't have any triggers. The target table either has zero rows or has no indexes. The TABLOCK hint is specified. This is covered in more detail later in the section; for now, the TABLOCK hint is another parameter you can give BCP to make it acquire a table lock before it begins writing data.

NOTE: What About SELECT INTO/BULKCOPY? If you're used to using SQL Server 7.0 or previous versions, you're probably wondering what happened to the SELECT INTO/BULKCOPY option. It's been replaced by a Recovery Mode option. You can choose one of three recovery modes: FULL, BULK_LOGGED, or SIMPLE. FULL mode is the default for everything except Desktop and the Data Engine versions of SQL Server. BULK_LOGGED is similar to the old SELECT INTO/BULKCOPY option, in that any bulk row operations have only allocations logged, not the data. FULL mode is the normal mode for most operations; it offers the widest variety of recovery options.
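The conditions for fast bulk copy listed above amount to a checklist, which can be sketched as a small Python function. This is only an illustration under stated assumptions: TargetTable and fast_bulk_copy_blockers are hypothetical names, and the values would have to be gathered from the server by some other means; nothing here queries SQL Server.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TargetTable:
    """Hypothetical snapshot of the facts the checklist needs."""
    recovery_mode: str      # "FULL", "BULK_LOGGED" or "SIMPLE"
    replicated: bool
    has_triggers: bool
    row_count: int
    index_count: int
    tablock_hint: bool      # whether the TABLOCK hint is given to BCP


def fast_bulk_copy_blockers(table: TargetTable) -> List[str]:
    """Return the reasons fast bulk copy would not apply (empty list = eligible)."""
    blockers = []
    if table.recovery_mode.upper() not in ("BULK_LOGGED", "SIMPLE"):
        blockers.append("recovery mode must be BULK_LOGGED or SIMPLE")
    if table.replicated:
        blockers.append("target table is involved in replication")
    if table.has_triggers:
        blockers.append("target table has triggers")
    if table.row_count > 0 and table.index_count > 0:
        blockers.append("target table has rows and indexes")
    if not table.tablock_hint:
        blockers.append("TABLOCK hint is not specified")
    return blockers


if __name__ == "__main__":
    staging = TargetTable("BULK_LOGGED", False, False, 0, 2, True)
    print(fast_bulk_copy_blockers(staging))
```

Returning the list of reasons rather than a single True or False makes it obvious which condition to fix before starting a large load.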


Chapter 5 ADVANCED DATA RETRIEVAL AND MODIFICATION 309

Thursday, July 30th, 2009

the actual 4-byte integer for the value 42, not the 2-byte character representation of 42. Here's an example of a session to create a character-based file:

    C:\Documents and Settings\MILLCS>bcp Chapter5..Sales out .\saleschar.dat -T
    Enter the file storage type of field PersonID [int-null]: char
    Enter prefix-length of field PersonID [1]: 0
    Enter length of field PersonID [12]:
    Enter field terminator [none]:
    Enter the file storage type of field ProductID [int-null]: char
    Enter prefix-length of field ProductID [1]: 0
    Enter length of field ProductID [12]:
    Enter field terminator [none]:
    Enter the file storage type of field QtyPurchased [int-null]: char
    Enter prefix-length of field QtyPurchased [1]: 0
    Enter length of field QtyPurchased [12]: 12
    Enter field terminator [none]:
    Enter the file storage type of field DatePurchased [datetime-null]: char
    Enter prefix-length of field DatePurchased [1]: 0
    Enter length of field DatePurchased [26]:
    Enter field terminator [none]: \n
    Do you want to save this format information in a file? [Y/n] y
    Host filename [bcp.fmt]: bcpchar.fmt
    Starting copy…
    8 rows copied.
    Network packet size (bytes): 4096
    Clock Time (ms.): total 1

This creates the same output as if you'd specified just -c on the BCP command line. Notice that the field storage type and prefix length had to be changed for each field, and the last field had to have a field terminator of \n. The \n puts each record on a new line. The resulting output file is a nice, column-delimited file:

    C:\Documents and Settings\MILLCS>type saleschar.dat
    1     37    4     2001-07-22 16:50:38.257
    1     38    3     2001-07-22 16:50:38.257
    3     39    1     2001-07-22 16:50:38.257
    4     51    1     2001-07-22 16:50:38.257
    4     47    1     2001-07-22 16:50:38.257
    9     37    10    2001-07-22 16:50:38.257
    9     38    5     2001-07-22 16:50:38.257
    10    41    6     2001-07-22 17:53:51.793
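A column-delimited file like this one is read by slicing each line at fixed positions. The sketch below is a rough Python illustration, assuming the field lengths accepted in the session above (12, 12, 12, and 26 characters) and assuming each value is padded with spaces on the right; the exact padding in a real file should be checked before relying on it. FIELD_WIDTHS and parse_fixed_width are hypothetical names.

```python
from typing import Dict, List, Tuple

# Field names and widths taken from the prompts in the session above.
# The left-justified, space-padded layout is an assumption.
FIELD_WIDTHS: List[Tuple[str, int]] = [
    ("PersonID", 12),
    ("ProductID", 12),
    ("QtyPurchased", 12),
    ("DatePurchased", 26),
]


def parse_fixed_width(line: str) -> Dict[str, str]:
    """Slice one record into named fields by position and trim the padding."""
    record = {}
    position = 0
    for name, width in FIELD_WIDTHS:
        record[name] = line[position:position + width].strip()
        position += width
    return record


if __name__ == "__main__":
    # Build a sample line the same way the assumed layout would look on disk.
    sample = ("1".ljust(12) + "37".ljust(12) + "4".ljust(12)
              + "2001-07-22 16:50:38.257".ljust(26) + "\n")
    print(parse_fixed_width(sample))
```

Because every field sits at a known offset, no delimiter is needed, which is also why such files are larger than delimited ones: the padding is written for every row.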


308 Part I EXAM PREPARATION 8 rows

Wednesday, July 29th, 2009

    8 rows copied.
    Network packet size (bytes): 4096
    Clock Time (ms.): total 1

In this example, all the defaults were used by just pressing the Enter key. This session of BCP results in two files being created: one is the output file, and the other is the format file, which in this case was named bcp.fmt. The data file that is created is the same one you'd get if you'd used -n for native format. Here's a format file:

    8.0
    4
    1 SQLINT 1 4 "" 1 PersonID ""
    2 SQLINT 1 4 "" 2 ProductID ""
    3 SQLINT 1 4 "" 3 QtyPurchased ""
    4 SQLDATETIME 1 8 "" 4 DatePurchased ""

The first row of the format file is the version number of BCP. (If you want to see just the version number, by the way, you can use BCP -v at the command line.) The second row is the number of columns described in the format file. The third row on to the end of the file is the actual layout of the file. The first column is the file column number. The second column is the data type. The third column is the prefix length, which is the number of bytes in the file that tell BCP how long the data field is, and is used only in native-format BCP. The fourth column is the number of bytes wide the data column is. The fifth column is the delimiter, which is what separates this column from the next column. Next is the server column order. Finally, the row ends with the field name and the collation for the column.

The file column number and server column number fields are used to do a couple of interesting things. First of all, if the table has columns that are in a different order than the file, you can manipulate the server column number to make it correct. Second, if you set the server column number to zero, then the column from the file gets skipped.

Prefix length is used when copying data in SQL Server native mode. If the format file weren't native mode, then it would have SQLCHAR as the type for each column, rather than SQLINT or SQLDATETIME. The type is the type that is written to (or read from) the file, not the database type. So if the type is SQLINT, then BCP is going to write out
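The layout just described can be read back mechanically. The following Python sketch parses a format file into one dictionary per column; it assumes that the empty delimiter and collation are written as empty quoted strings (""), which is why shlex is used to split each row. parse_format_file and server_mapping are hypothetical helper names, not part of BCP.

```python
import shlex
from typing import Dict, List, Tuple

# Sample format file; the "" placeholders for delimiter and collation
# are an assumption about how empty values are written.
FORMAT_FILE = '''8.0
4
1 SQLINT 1 4 "" 1 PersonID ""
2 SQLINT 1 4 "" 2 ProductID ""
3 SQLINT 1 4 "" 3 QtyPurchased ""
4 SQLDATETIME 1 8 "" 4 DatePurchased ""
'''


def parse_format_file(text: str) -> Tuple[str, List[Dict[str, object]]]:
    """Return (version, columns) from the text of a format file."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    version = lines[0].strip()
    column_count = int(lines[1])
    columns = []
    for row in lines[2:2 + column_count]:
        parts = shlex.split(row)  # keeps "" as an empty string
        columns.append({
            "file_column": int(parts[0]),
            "file_type": parts[1],
            "prefix_length": int(parts[2]),
            "data_length": int(parts[3]),
            "terminator": parts[4],
            "server_column": int(parts[5]),
            "name": parts[6],
            "collation": parts[7],
        })
    return version, columns


def server_mapping(columns: List[Dict[str, object]]) -> Dict[int, int]:
    """Map file column -> server column, leaving out columns set to 0 (skipped)."""
    return {c["file_column"]: c["server_column"]
            for c in columns if c["server_column"] != 0}


if __name__ == "__main__":
    version, columns = parse_format_file(FORMAT_FILE)
    print(version, server_mapping(columns))
```

Editing the server column numbers in the file and re-running server_mapping is a quick way to sanity-check a reordering or a skipped column before handing the format file to BCP with -f.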


Chapter 5 ADVANCED DATA RETRIEVAL AND MODIFICATION 307

Tuesday, July 28th, 2009

Parameter   Function
-N          Specifies a native mode that is slower than -n, but that doesn't destroy Unicode.
-w          Specifies a Unicode mode that's even slower than -N, because it also specifies a tab-delimited file with a newline character at the end of each row. The native modes can store numbers in binary format, so they are faster.
-V          Tells BCP to use one of the old SQL Server versions' data types for import and export. For an export, this also translates null bit fields to zero, because previous versions didn't handle that. This has no effect on date fields, which are always copied out however ODBC wants to do it.

Notice that there is a command-line option -n and another one that's -N. All BCP command-line options are case sensitive.

If you don't specify one of -c, -f, -n, -N, or -w, BCP assumes that you don't have a format file, you don't want to use any of the predefined formats, and that you want to make one. It reads the layout of the table you're using and then walks you through a prompted, one-column-at-a-time process, and then you can save the result as a format file. Here's an example of a BCP session where BCP is prompting for information:

    C:\Documents and Settings\MILLCS>bcp Chapter5..Sales out .\sales.dat -T
    Enter the file storage type of field PersonID [int-null]:
    Enter prefix-length of field PersonID [1]:
    Enter field terminator [none]:
    Enter the file storage type of field ProductID [int-null]:
    Enter prefix-length of field ProductID [1]:
    Enter field terminator [none]:
    Enter the file storage type of field QtyPurchased [int-null]:
    Enter prefix-length of field QtyPurchased [1]:
    Enter field terminator [none]:
    Enter the file storage type of field DatePurchased [datetime-null]:
    Enter prefix-length of field DatePurchased [1]:
    Enter field terminator [none]:
    Do you want to save this format information in a file? [Y/n]
    Host filename [bcp.fmt]:
    Starting copy…


306 Part I EXAM PREPARATION BCP Speed and

Monday, July 27th, 2009

also the fastest format, so use it whenever you can. The second fastest format is character-delimited. Character-delimited formats use some character, typically a comma, space, or a vertical bar, to separate the data columns. Finally, there is column-delimited data, which means the columns within the data file start and end at specific positions in the file. This is also called fixed column width or just fixed column data format. This tends to be the slowest way to BCP data around.

NOTE: BCP Speed and File Format. What makes one method of BCP faster than another? Native format files are the smallest; character-delimited files are the next smallest; and column-delimited files are the largest. BCP is so fast and well optimized that it is bound by how fast it can read or write data to or from the file.

BCP does not create tables. You have to have a table set up and waiting for BCP before you run BCP. So, how do you run BCP? Here's an example of reading data from a comma-delimited text file into a database table:

    bcp chapter5.dbo.sales in sales.csv -T -c -r\n -t,

The -T tells BCP to use a trusted connection. There is no server specified; it would be specified with the -S option, so the data goes to the local server. The -c tells BCP that it's supposed to use a character-delimited copy; the -r says that each row will be delimited with a newline character; and the -t says that each column will be delimited with a comma.

BCP has a bunch of command-line parameters. Table 5.3 lists the ones that are used for determining the file format.

TABLE 5.3 BCP COMMAND-LINE PARAMETERS: FORMAT PARAMETERS

Parameter   Function
-c          Specifies that a character-delimited file is to be used.
-r          Specifies the end-of-line character. Usually this is specified as \r\n, which specifies that there is a new line at the end of each line.
-t          Specifies the end-of-field character, typically a comma, vertical bar, or sometimes a space. Commas can be specified as -t, but any delimiter can be specified in double quotes, such as -t"|"
-f          Specifies a format file to use. This is typically used to handle delimited data.
-n          Specifies that BCP should use native mode for copying. This parameter will copy all normal character data and non-character data okay, but will destroy any Unicode values.
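To make the -t and -r settings concrete, here is a small Python sketch that produces a character-delimited file body with a chosen field and row terminator. It is an illustration only: to_delimited is a hypothetical helper, and it deliberately refuses values that contain a terminator, on the assumption that a plain character-delimited load has no quoting mechanism to protect them.

```python
from typing import Iterable, Sequence


def to_delimited(rows: Iterable[Sequence[object]],
                 field_sep: str = ",",
                 row_sep: str = "\n") -> str:
    """Join rows into delimited text using the given field and row terminators."""
    pieces = []
    for row in rows:
        fields = [str(value) for value in row]
        for field in fields:
            # A terminator inside the data would shift every later column.
            if field_sep in field or row_sep in field:
                raise ValueError(f"value {field!r} contains a terminator")
        pieces.append(field_sep.join(fields) + row_sep)
    return "".join(pieces)


if __name__ == "__main__":
    sales = [(1, 37, 4), (1, 38, 3), (3, 39, 1)]
    print(to_delimited(sales), end="")                 # comma-delimited
    print(to_delimited(sales, field_sep="|"), end="")  # vertical-bar-delimited
```

Picking a delimiter that cannot occur in the data, such as a vertical bar for free-text columns, avoids the error path entirely.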


Chapter 5 ADVANCED DATA RETRIEVAL AND MODIFICATION 305

Monday, July 27th, 2009

Importing and Exporting Data with BCP

The first thing to understand about BCP, the Bulk Copy Program, is that it's not a SQL Server command. It's not part of T-SQL. If you attempt to use BCP in Query Analyzer, it does everything it can to just laugh at you. Don't do it; it doesn't work. BCP is a command-line tool. That's right: the big, black empty window with the blinking cursor command line. So fire up a command prompt and dig in.

BCP is ancient, in computer years anyway. It's part of the wild history of SQL Server, and has been part of SQL Server since at least version 4.21, back when it was still a joint development effort between Microsoft and Sybase. The reason it's still around is that it's an extremely useful tool for loading data into a database quickly. The reason it's a command-line tool is all about overhead. Keep in mind that you can run BCP across the network; it doesn't have to run on a server. You can have a bunch of servers all across your network using BCP at once, and, assuming you have enough disk speed, SQL Server just sits there and soaks up data.

NOTE: Bee Sea Pea. If you're looking for help with BCP, or just trying to find articles on Microsoft's support web site, you can just look for the acronym BCP. It's one of those acronyms so common that people usually don't remember what the letters mean.

BCP has lots of command-line options. The basic syntax is bcp followed by five parts, in this order: the table, the direction, the file, the security information, and the format information.

The table is the destination table, usually specified as a three-part name, like Chapter5.dbo.sales. The direction is what direction the data moves. Telling BCP to go IN tells BCP to read from the file and put data IN SQL Server. Telling BCP to go OUT pulls data OUT of SQL Server and writes it to a file. The file is the name of the file that you want to use. If it's an IN operation, then the file should exist and have data in it. If it's an OUT operation and the file exists, the file gets overwritten by the data coming out; otherwise, the file is created. The security information is the name of the server that you're trying to use, and either a username and password or a note to use your Windows authentication to handle it. Finally, the format information tells BCP what kind of format the data is in.

BCP can deal with three data formats: native, character-delimited, and column-delimited. Native format works only when you're moving data from one SQL Server to another, and the servers have to use the same collation and character set for it to work. It's
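The five parts of the command line can be assembled programmatically, which helps when the same load runs against several servers. The sketch below is a hedged Python illustration: build_bcp_args is a hypothetical helper, it only builds the argument list and does not run anything, and it assumes the standard bcp switches -S (server), -T (trusted connection), and -U and -P (username and password).

```python
from typing import List, Optional


def build_bcp_args(table: str,
                   direction: str,
                   data_file: str,
                   server: Optional[str] = None,
                   user: Optional[str] = None,
                   password: Optional[str] = None,
                   format_flag: str = "-c") -> List[str]:
    """Build a bcp argument list: table, direction, file, security, format."""
    direction = direction.lower()
    if direction not in ("in", "out"):
        raise ValueError("direction must be IN or OUT")
    args = ["bcp", table, direction, data_file]
    if server:
        args += ["-S", server]        # omitted: the local server is used
    if user is None:
        args.append("-T")             # Windows authentication
    else:
        args += ["-U", user, "-P", password or ""]
    args.append(format_flag)          # options are case sensitive: -n is not -N
    return args


if __name__ == "__main__":
    print(build_bcp_args("Chapter5.dbo.sales", "IN", "sales.csv"))
```

Keeping the arguments as a list rather than one string means the list could later be handed to a process launcher without worrying about quoting file names that contain spaces.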


304 Part I EXAM PREPARATION REVIEW BREAK XML

Sunday, July 26th, 2009

REVIEW BREAK

XML in Review

XML is a feature that is very well covered by the exam, so be sure to use the examples here and get a good understanding of OPENXML and how the path syntax works.

- XML is a document format that can be used to transfer hierarchical data between systems.
- XML documents can be created with the FOR XML clause of the SELECT statement. This clause provides several options for formatting XML output.
- You can create rowsets from XML documents using the OPENXML statement.

Spend some time typing in the examples from this chapter, or use the slightly more complex examples found in Books Online to get a complete idea of how the OPENXML and FOR XML ideas really work.

IMPORTING AND EXPORTING DATA

- Import and export data. Methods include the bulk copy program, the Bulk Insert Task, and Data Transformation Services (DTS).

Outside of XML-land, moving data around is pretty simple. There are three major ways to do import/export tasks in SQL Server. They all have strengths and weaknesses. The Bulk Copy Program (BCP) is probably the hardest to learn, but it is also extremely capable and almost ludicrously fast. The BULK INSERT statement implements part of BCP inside SQL Server, so it has all the speed of BCP with an easier-to-use interface. Finally, the Data Transformation Services, or DTS, provide a lot of flexibility and capabilities in a very graphically intensive, point-and-click environment.


Chapter 5 ADVANCED DATA RETRIEVAL AND MODIFICATION 303

Saturday, July 25th, 2009

You can enter the following if you want to use the rowpattern /Person/Sales to return sales information, and show the first name and last name for each sale:

    SELECT *
    FROM openxml(@hdoc, '/Person/Sales', 1)
    WITH (FName varchar(30) '../@FirstName',
          LName varchar(30) '../@LastName',
          QtyPurchased int '@QtyPurchased')

Notice the use of the .. syntax in specifying the output of the XML file. This SELECT statement uses /Person/Sales as the default level, so any value that exists at that level can be specified by just the name of the value. Here's the output:

    FName   LName      QtyPurchased
    Shelly  Alexander  5
    Shelly  Alexander  10

Anything above or below the default level, /Person/Sales, has to be qualified with a path, which is what the .. syntax represents: a path representing the level above the default, in this case /Person. Here's another example:

    select *
    from openxml(@hdoc, '/Person', 1)
    WITH (FName varchar(30) '@FirstName',
          LName varchar(30) '@LastName',
          QtyPurchased int 'Sales/@QtyPurchased')

This shows a different syntax, and provides a different result. The preceding example returned every person and sale. This example returns only the first sale:

    FName   LName      QtyPurchased
    Shelly  Alexander  5

It returns only the first sale because it's returning one row for each default level, which is the /Person level in this case.

EXAM TIP: Pathing XML. Pathing XML is very likely to be on your exam because it's important to understand how it works if you're going to use OPENXML(), and because Microsoft is very proud of the new XML features in SQL Server 2000.

So, now you know how to export data to XML format, which is a pretty useful thing. You also should have a good handle on how to translate data from XML into a rowset, which is marginally useful. So now it's time to move real data in and out of SQL Server.
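The difference between the two rowpatterns comes down to which level produces the rows. The following Python sketch imitates that idea with the standard library's ElementTree; it is an analogy, not OPENXML itself. The XML document is a hypothetical reconstruction that merely matches the output shown above, and rows_per_sale and rows_per_person are made-up names.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical document shaped to match the output shown above.
DOC = """
<Person FirstName="Shelly" LastName="Alexander">
  <Sales QtyPurchased="5"/>
  <Sales QtyPurchased="10"/>
</Person>
"""

Row = Tuple[str, str, int]


def rows_per_sale(xml_text: str) -> List[Row]:
    """One row per Sales element; names come from the parent level (the '..' step)."""
    person = ET.fromstring(xml_text)
    return [(person.get("FirstName"), person.get("LastName"),
             int(sale.get("QtyPurchased")))
            for sale in person.findall("Sales")]


def rows_per_person(xml_text: str) -> List[Row]:
    """One row per Person element; only the first Sales child contributes."""
    person = ET.fromstring(xml_text)
    first_sale = person.find("Sales")
    return [(person.get("FirstName"), person.get("LastName"),
             int(first_sale.get("QtyPurchased")))]


if __name__ == "__main__":
    print(rows_per_sale(DOC))
    print(rows_per_person(DOC))
```

In both functions the row count is fixed by the element being iterated, and everything else is looked up relative to it, which is the same rule the rowpattern and the column paths follow.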


302 Part I EXAM PREPARATION So, now you

Saturday, July 25th, 2009

So, now you have the data you want, extracted from an XML rowset. Using the WITH clause is basically the same syntax as laying out the columns in a table: the column name, some space, the data type, a comma, and then the next column name.

Well, the syntax stays the same until you decide you don't want a column called FirstName; you'd rather have columns called FName and LName. So, now what?

    SELECT *
    FROM openxml(@hdoc, '/Person', 1)
    WITH (FName varchar(30) '@FirstName',
          LName varchar(30) '@LastName')

Now you end up with a nice, clean rowset. You need to look at one more thing for the exam. Imagine that you want to output something farther up the tree from where you specify the rowpattern.

IN THE FIELD: XML PARSERS

If the OPENXML stuff looks like it's extremely cumbersome to deal with, there's a good reason for it. It is extremely cumbersome. You can't read the XML in from a file very easily; you have to spend a huge amount of time fighting with arcane bit-field flags, and you get to completely reformat your data using a WITH option. And you do all that just to get a few rows out, because you can't declare a variable of type TEXT, so you can hold only about 8KB of XML in SQL Server at a time.

Learn this stuff for the test. If you are ever involved in a project that requires you to import XML, use any of about five readily available scripting languages (Perl, VBScript, Java, Python, and C# come to mind, and there are probably dozens more), parse the XML using the already written, elegant, and useful tools in those languages, tools that are specifically designed to parse XML, and have the scripts write out nice, comma- (or something) delimited text. You'll learn how to import that in the next section. That way, you don't have to worry about getting memory leaks in SQL Server, and you don't need to be concerned about running this a thousand times to get all your data in 8KB at a time.
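The sidebar's advice, parse the XML in a scripting language and write out delimited text, might look roughly like this in Python. This is a minimal sketch under stated assumptions: the People wrapper element and the second person are invented for illustration, and people_xml_to_csv is a hypothetical name. It uses only the standard library and renames FirstName and LastName to FName and LName on the way out.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input; only the FirstName/LastName attributes come from the text.
PEOPLE_XML = """
<People>
  <Person FirstName="Shelly" LastName="Alexander"/>
  <Person FirstName="Pat" LastName="Example"/>
</People>
"""


def people_xml_to_csv(xml_text: str) -> str:
    """Convert Person elements to comma-delimited text with FName, LName columns."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["FName", "LName"])  # header row; leave it out if the loader expects data only
    for person in root.iter("Person"):
        writer.writerow([person.get("FirstName"), person.get("LastName")])
    return buffer.getvalue()


if __name__ == "__main__":
    print(people_xml_to_csv(PEOPLE_XML), end="")
```

The script has no 8KB ceiling because the whole document is parsed outside the database, and the resulting text is exactly the kind of file a bulk load is designed to consume.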
