Where is the Normal Style Font defined in an XLSX file when file uses multiple fonts

Chazg 21 Reputation points
2022-11-30T14:18:56.97+00:00

Hi,

I am working on a project that needs to extract data from the internal xml files stored within an xlsx file.

The part I am currently working on involves trying to extract the xlsx files "Normal font" information. As I understand it, this information is stored in the internal file xl/styles.xml.

This information is easy to extract when the file/sheet contains a single font. However I don't know how to extract the correct "Normal font" info if the xl/styles.xml file contains multiple font definitions (i.e. when the <fonts> tag contains more than one <font> child element).

I originally thought the "Normal font" was always the first font defined but have found xlsx files where this is not the case.

So, my question: How can I extract the "Normal font" information for the xlsx file from the internal file xl/styles.xml when the <fonts> tag contains more than one <font> child ?

P.S. I have found the following related question on this forum:
https://social.msdn.microsoft.com/Forums/SECURITY/en-US/0af5e410-493a-4478-8b84-5b7d92755561/where-is-the-normal-style-font-defined-in-an-xlsx-file?forum=os_binaryfile

However, it does not explain how to find the "Normal font" when multiple fonts are defined for the xlsx file (in xl/styles.xml)

Many thanks in advance.

Office Open Specifications
Office Open Specifications
Office: A suite of Microsoft productivity software that supports common business tasks, including word processing, email, presentations, and data management and analysis.Open Specifications: Technical documents for protocols, computer languages, standards support, and data portability. The goal with Open Specifications is to help developers open new opportunities to interoperate with Windows, SQL, Office, and SharePoint.
127 questions
{count} votes

Accepted answer
  1. Tom Jebo 1,991 Reputation points Microsoft Employee
    2022-12-01T02:33:16.657+00:00

    Hi @Chazg ,

    You should take a look at ISO 29500-1:2016 where styles are defined for SpreadsheetML and workbook packages created by Excel. The standard can be downloaded here:

    ISO/IEC 29500-1:2016

    In section 18.8.7 cellStyle (Cell Style), you will find attributes on the cellStyle tags used in the xl/styles.xml part. In particular xfId is used to identify which formatting record is used by a given named style like "Normal". This xf formatting record in the cellStylesXfs block will contain a fontId attribute that specifies the index into the fonts block of the correct font to be used.

    For example, you will see something like this in xl/styles.xml:

    <styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac x16r2 xr" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:x16r2="http://schemas.microsoft.com/office/spreadsheetml/2015/02/main" xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision">  
        <fonts count="2" x14ac:knownFonts="1">  
            <font>  
                <sz val="11"/>  
                <color theme="1"/>  
                <name val="Calibri"/>  
                <family val="2"/>  
                <scheme val="minor"/>  
            </font>  
            <font>  
                <sz val="11"/>  
                <color theme="1"/>  
                <name val="Envy Code R"/>  
                <family val="3"/>  
            </font>  
        </fonts>  
        <cellStyleXfs count="1">  
            <xf numFmtId="0" fontId="0" fillId="0" borderId="0"/>  
        </cellStyleXfs>  
        <cellXfs count="2">  
            <xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0"/>  
            <xf numFmtId="0" fontId="1" fillId="0" borderId="0" xfId="0" applyFont="1"/>  
        </cellXfs>  
        <cellStyles count="1">  
            <cellStyle name="Normal" xfId="0" builtinId="0"/>  
        </cellStyles>  
    

    In this example, the first font (line 3 above) is index 0 in the font list and specifies Calibri.

    Normal (line 25) specifies an xfId of 0 which points to the first and only indexed xf record in cellStyleXfs (line 18). Normal is a built-in style which means it will be specified and have the attribute builtinId set to 0 (which is for Normal). If the author of the workbook modifies the Normal style, it will have the customBuiltin attribute set to 1 for true like this:

    <cellStyle name="Normal" xfId="0" builtinId="0" customBuiltin="1"/>  
    

    I hope this helps, let me know if you have more questions.

    Best regards,
    Tom Jebo
    Microsoft Open Specifications Support

    1 person found this answer helpful.
    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. Chazg 21 Reputation points
    2022-12-01T08:37:29.097+00:00

    Hi @Tom Jebo ,

    Many thanks for your reply.

    If I understand correctly, the following procedure needs to be followed in order to identify the 'Normal font' of an xlsx/xlsm/etc. file:

    STEP 1:
    Open the internal xml file xl/styles.xml

    STEP 2:
    Locate the <cellStyles> tag and find the child tag <cellStyle> that contains the attribute builtinId="0". For this child tag make a note of the value of attribute "xfId".

    STEP 3:
    Now locate the <cellStyleXfs> tag. Find the nth child tag <cellStyleXf> where n is the value of the "xfId" attribute found in STEP 2 (n is zero based). For this child tag make
    a note of the value of the attribute "fontId"

    STEP 4:
    Now locate the <fonts> tag. Find the mth child tag <font> where m is the value of the "fontId" attribute found in STEP 3 (m is, again, zero based).
    This <font> tag information is used by the xlsx/xlsm/etc. file for the "Normal font".

    Is this correct ?