Thursday, July 23, 2009

Long time no post...

... Its almost a year since i posted here.

Lots of things learnt about the OOXML SDK Version 2.

Will post how to;s one by one.

Tuesday, September 16, 2008


Wow! I am amazed at the color coding used in VML!
If you see such a color... dont be surprised! Its actually a short form for
color="#993300", which happens to be an indexed color, with index 60 (if you dont know what the indexed color pallette is... will post in the next blog)...

Gosh! Why such a style? Does any one know? It looks absurd to me now... may be later on as I work with this I will understand why Microsoft wants to the touch the nose from the other side.


Monday, September 15, 2008

Active X Class IDs

The list of Active X Class IDs which I mentioned in my previos blog are here:

Toggle button -- "{8BD21D60-EC42-11CE-9E0D-00AA006002F3}";
Image -- "{4C599241-6926-101B-9992-00000B65C6F9}";
Label -- "{978C9E23-D4B0-11CE-BF2D-00AA003F40D0}";
Radio button -- "{8BD21D50-EC42-11CE-9E0D-00AA006002F3}";
Spin button -- "{79176FB0-B7F2-11CE-97EF-00AA006D2776}";
Scrollbar -- "{DFD181E0-5E2F-11CE-A449-00AA004A803D}";
Text box -- "{8BD21D10-EC42-11CE-9E0D-00AA006002F3}";
Listbox -- "{8BD21D20-EC42-11CE-9E0D-00AA006002F3}";
Checkbox -- "{8BD21D40-EC42-11CE-9E0D-00AA006002F3}";
Combo Box -- "{8BD21D30-EC42-11CE-9E0D-00AA006002F3}";
Button -- "{D7053240-CE69-11CD-A777-00DD01143C57}";


Friday, September 12, 2008

How to determine what kind of ActiveX components are embedded in the Sheet?

If you have a Form control listbox and an ActiveX listbox... here's how they get encoded in the vmlDrawing1.vml document for that sheet.

Now... the first shape's ClientData says that this is an object of type List box.. this signfies the form control... the second shape's ClientData says that this is an object of type "Pict"... and this actually signifies the ActiveX Listbox control...

now if you embed an activeX checkbox control in the sheet, it will also generate a shape and a clientData with object type = "pict" .... in such a situation... how will you distinguish which pict is a listbox and a checkbox...

There's only one way you can do that... No.. not by running through the child nodes of ClientData and determining with the combination of child controls... you can achieve by doing so, but its not recommended.
The reliable way to do that is to retrieve the ActiveX?.xml of the control embedded in that sheet. Where are the ActiveX?.xml files stored?
They are stored in the xl/ActiveX folder of the Xlsx package.

How do we know which ActiveX?.xml is encoding what control in the sheeet...

There is a section "controls" in the sheet?.xml which has the relationship Ids and names of those controls in it.

Now by iterating on the sheet?'s relationships... you can get the activeX?.xml parts easily.. using the sheet1.xml.rels

Now the activeX1.xml contents will look like this.

note the attirbute ax:classId of the ax:ocx node... the value of it is a string

the first part of the class id is "8BD21D40"... this represents a listbox...
if the first part of the class id is "8BD21D20" then it represents a checkbox... and so on... (I will post the full list of class ids, for other contorls after I do some research)

for the rest of the properties of the listbox, you can use their corresponding shape elements in the VMLdrawing1.vml for the sheet... how do you get those shapes?? Using the shape Id mentioned in the Control element for the activeX control, in the Controls section of the sheet?.xml The Shape Id mentioned in this section will not match exactly to the v:shape's shapeid.... you have to find the shape by checking which v:shape's shapeid ends with the shapeId mentioned in the controls section.

So to read an activeX form control embedded in the sheet, you have to first determine what type of control it is using the activeX?.xml's classid, and then the properties of that control using the shape Id, and retrieve that shape from the sheet's vml document using the shape Id...


NOTE: The XML sections have not been posted in the blog... because the blog construes them as HTML... will convert them to plain text and update the blog soon.

Tuesday, September 9, 2008

Open XML SDK V2.0 - How to read an Excel 2007 using Open XML Format SDK V2.0

Here's a quickie on navigating to a cell in an Excel 2007 XML document, using the Open XML Format SDK verison 2.0 released recently...

SpreadsheetDocument ssdocCellStyleTest = SpreadsheetDocument.Open("C:\\XML2\\BorderTest.xlsx", false);
Workbook workbook = ssdocCellStyleTest.WorkbookPart.Workbook;
WorksheetPart sheetpart;
foreach (WorksheetPart worksheetpart in workbook.WorkbookPart.WorksheetParts)
Worksheet worksheet = worksheetpart.Worksheet;
foreach (SheetData sheetData in worksheet.Elements())
foreach (Row row in sheetData.Elements())
MessageBox.Show("row index is " + row.RowIndex);
foreach (Cell cell in row.Elements())
MessageBox.Show("cell index " + cell.CellReference);

Stylesheet styleSheet = workbook.WorkbookPart.WorkbookStylesPart.Stylesheet;
foreach (CellFormat cellFormat in styleSheet.CellFormats.Elements())
MessageBox.Show("Cell Style ID " + cellFormat.FormatId);
MessageBox.Show("Cell has border " + cellFormat.ApplyBorder);
MessageBox.Show("Cell has fill " + cellFormat.FillId);
DocumentFormat.OpenXml.Spreadsheet.Fonts oFonts = styleSheet.Fonts;
DocumentFormat.OpenXml.Spreadsheet.Font oFont = (DocumentFormat.OpenXml.Spreadsheet.Font)oFonts.ChildElements[int.Parse(cellFormat.FontId.ToString())];
foreach (FontName oFontName in oFont.Elements())
foreach (OpenXmlAttribute attrib in oFontName.GetAttributes())
MessageBox.Show("Font attrib " + attrib.LocalName + " value " + attrib.Value);

So ... as you see every element of the SpreadsheetML language... is now an OpenXmlElement class, or either an OpenXmlLeafElement class... you are familiar with WorkbookPart, ThemePart etc.. now each part comes with an object like workbookPart has Workbook object, ThemePart has Theme object, StyleSheetPart has a stylesheet object.. which is a wrapper of the data in the xml stream of its corresponding part.
ITs stilll not that robust... as you can see... given the Font id and the stylesheet.Fonts collection object retrieving the font attributes is still difficult, because even b, u, i, name elements have classes Bold, Italic, Underline, FontName etc. (NOTE: Dont confuse this with System.Drawing objects, they are part of the DocumentFormat.OpenXml.Spreadsheet namespace)

However the theme object is a good wrapper. For example theme.themeElements()[0].ColorScheme.AccentColor.RGBColorInHex gives the rgb value of the theme color instantly.. we dont have navigate as we did in Fonts....

So its an inconsistant new SDK... there is no uniformity.... because DocumentFormat.OpenXml.Spreadsheet.Font is different from DocumentFormat.OpenXML.WordProcessing.Font....

Have to explore more on VML drawing extraction usig this new SDK...

Happy excelling!


Thursday, August 21, 2008

Removed Part: /xl/styles.xml error - (While manually manipulating styles of an XLSX document)

I noticed that the "numFmts" block has to be placed as the first child of the styleSheet Document else you will get an error like this "Removed Part: /xl/styles.xml part with XML error. (Styles) Load error. Line 31, column 19."

The error vanished when I placed the numFmts block as the first child of the styleSheet node in the styles.xml file.

-Kalpana (21.08.2008)