SheetJS Development Notes

(works best in Chrome, iOS7 Safari, IE11 and OSX Safari)
Follow us on Twitter @SheetJS
Using the file <a id="filename">somefile</a>. Supported file formats are listed below. A very simple grid demo is available at <a href="http://oss.sheetjs.com">http://oss.sheetjs.com</a>. This demo goes into more detail. <div class="info">Text in blue is dynamically generated after parsing the file in your browser</div> `Code snippets are shown in red` # sexql (the cool demo) The obligatory "cool demo" converts your Excel file into a database and lets you run [SQLite-compatible](https://www.sqlite.org/lang_select.html) SQL select statements (powered by [sql.js](https://github.com/kripken/sql.js)). `WARNING: sql.js is massive (2MB)`: <label>Enter an SQL select statement: &nbsp; &nbsp;</label><input type="text" id="sql" class="winpt"/> <pre id="sqlpre">Notes: </pre> Output: <div id="sqldiv"></div> DB Creation Statements: <pre id="sexqls"></pre> # The Basics ### Sheet names <pre id="sheetnames" class="info">Text replaced with the sheet names</pre> Getting sheet names: `workbook.SheetNames` ### Selecting individual values Picking individual cells is easy <label>Type in a cell address ([A1-style](http://office.microsoft.com/en-us/excel-help/about-cell-and-range-references-HP005198323.aspx?CTT=1) ): &nbsp; &nbsp;</label><input type="text" id="cell" /> <label>Override number format ([Excel-style](http://office.microsoft.com/en-us/excel-help/create-a-custom-number-format-HP010342372.aspx?CTT=1#BM1)): &nbsp; &nbsp;</label><input type="text" id="fmt" /> <pre id="cellval">Value: </pre> Get cell object: `cell = workbook.Sheets[sheet_name][cell_address_in_A1_format]` Get cell value: `cell.v` Get cell formatted text: `cell.w` For more details, jump down to "Common Spreadsheet Format" section ### Table range <pre id="sheetranges" class="info"></pre> Getting sheet range: `workbook.Sheets[sheet_name]['!ref']` Extracting start and end: `CSF.utils.decode_range(workbook.Sheets[sheet_name]['!ref'])` ### Manually Rendering a Dumb Table Once you know the bounds, you can set up the scaffold manually: <div id="dumbtable" class="info"></div> Encoding a cell reference: `CSF.utils.encode_cell({c:desired_col, r:desired_row})` ### Hopping row by row The sheet event makes the process slightly easier by generating row objects. Since the row objects are JSON, the order is lost. The sheet event also passes a column headers array to preserve order <pre id="sheetcols" class="info"></pre> JSON: `arguments[0]` in the sheet callback Sheet columns: `arguments[1]` in the sheet callback ### Another table This table is built by iterating over the rows. Compare with <http://sheetjs.com/status/>: <div id="mtout"></div> # Using Dropsheet in your code ### Setting up the object ``` var DS = DropSheet(dopts); ``` `dopts` is expected to be an object specifying event handlers: ``` { errors: { badfile: /* file is not actually a spreadsheet */ pending: /* second workbook requested while first is processing */ failed: /* encountered an error */ large: /* give an opportunity to stop if file is too large */ }, on: { wb: /* first callback with workbook */ workstart: /* perfect time to start a spinner */ workend: /* perfect time to stop a spinner */ sheet: /* callback with processed worksheet data */ } } ``` ### Getting the file There are many techniques. Here's one: ``` var oReq = new XMLHttpRequest(); /* yes, XHR */ oReq.open("GET", '/' + file, true); oReq.responseType = "arraybuffer"; oReq.onload = function(e) { /* For some unknown reason, Chrome slows to a crawl with typed arrays */ var arraybuffer = oReq.response; var data = new Uint8Array(arraybuffer); var arr = new Array(); for(var i = 0; i != data.length; ++i) arr[i] = data[i]; DS(arr); /* Drop it like it's hot */ } ``` ### Binding to an element for drops Consult <http://oss.sheetjs.com> for an example # Common Spreadsheet Format ### Workbook `workbook.SheetNames` is an ordered list of the sheets in the workbook `workbook.Sheets[sheetname]` returns a data structure representing the sheet. Each key that does not start with `!` corresponds to a cell (using `A-1` notation). `workbook.Sheets[sheetname][address]` returns the specified cell. ### Cell fields - `.v` : the raw value of the cell - `.w` : the formatted text of the cell (if applicable) - `.t` : the type of the cell (constrained to the enumeration `ST_CellType` as documented in page 4215 of ISO/IEC 29500-1:2012(E) ) - `.f` : the formula of the cell (if applicable) - `.r` : the rich text encoding of a cell text (if applicable) - `.h` : an HTML rendering of the rich text (if applicable) - `.c` : comments associated with the cell - `.z` : the number format string associated with the cell (if requested) # sexql (how it is done) The SQL part is handled by sql.js. All we need to do is determine data types, generate CREATE TABLE SQL statements and insert data. ### Determining Data Types Excel treats all numbers as IEEE-754 doubles (just like JS Number type). It does not try to distinguish integers. To keep things simple, we resolve data types as "TEXT" or "REAL" to make SQLite happy: ``` var types = new Array(range.e.c-range.s.c+1); var names = new Array(range.e.c-range.s.c+1); /* find range */ var range = CSF.utils.decode_range(ws['!ref']); var R = range.s.r; /* determine column names */ for(var C = range.s.c; C<= range.e.c; ++C) names[C-range.s.c] = (ws[CSF.utils.encode_cell({c:C,r:R})]||{}).w; /* loop through cells to determine types */ for(var C = range.s.c; C<= range.e.c; ++C) for(R = range.s.r+1; R<= range.e.r; ++R) switch((ws[CSF.utils.encode_cell({c:C,r:R})]||{}).t) { case 'e': break; /* error type */ case 'b': /* boolean -> number */ case 'n': if(types[C-range.s.c] !== "TEXT") types[C-range.s.c] = "REAL"; break; case 's': case 'str': types[C-range.s.c] = "TEXT"; default: break; /* if the cell doesnt exist */ } ``` ### Generating CREATE TABLE statements No rocket science here (wrap names in backticks, in case a column name has spaces): ``` stmt("CREATE TABLE `" + sname + "` (" + names.map(function(n, i) { return "`" + n + "` " + (types[i]||"TEXT"); }).join(", ") + ");" ); ``` ### Inserting data We generate a query for each row. Remember that `.w` refers to formatted text and `.v` is the numeric value for number cells: ``` for(R = range.s.r+1; R<= range.e.r; ++R) { var fields = [], values = []; for(var C = range.s.c; C<= range.e.c; ++C) { var cell = ws[CSF.utils.encode_cell({c:C,r:R})]; if(!cell) continue; fields.push("`" + names[C-range.s.c] + "`"); values.push(types[C-range.s.c] === "REAL" ? cell.v : '"' + cell.w + '"'); } stmt("INSERT INTO `" +sname+ "` (" + fields.join(", ") + ") VALUES (" + values.join(",") + ");"); } ```