Coding Data

Repository for Coding Data and comments about working up the data for analysis.


  • Topics vs "Time" graphs are now implemented in the FINAL* ODF files and are also available as PNG and PDF versions of the graphs themselves. -- DickFurnas - 25 Nov 2009
  • Dick - I just uploaded the updated UCSMP file. I'll be in touch soon. -- GabrielDobbs - 18 Nov 2009
  • UCSMP coding errors in ch 1-8 have been corrected in FINALS_UCSMP.ods. Chapters 9-13 need to be added in from -- DickFurnas - 17 Nov 2009
  • I have uploaded coding for the entire UCSMP Algebra book (Chapters 1-13). Looking forward to seeing the analysis! -- MaryAnnHuntley - 17 Nov 2009
  • The reference University of Chicago School Mathematics Project Data which Gabe compiled into a master spreadsheet. The TWiki page is not pretty, but functional in the sense that it can be easily searched and the revision control allows for finding changes going forward. -- DickFurnas - 06 Nov 2009
  • The reference Core Plus Data which Gabe compiled into a master spreadsheet. The TWiki page is not pretty, but functional in the sense that it can be easily searched and the revision control allows for finding changes going forward. -- DickFurnas - 06 Nov 2009
  • The Core plus data looks great! I just posted the combined UCSMP spreadsheet. It looks to be in the same format as the Core plus data, so hopefully we can apply the formulas you were working on yesterday. -- GabrielDobbs - 03 Nov 2009
  • I separated out Book as referenced in the files from ISBN since in doing some searching and reading about ISBN numbers, they can be messier than I had hoped. I thought it true that an ISBN number uniquely identifies a book, edition, and likely pagination, but I'm no longer so sure. It turns out to be discouraged as an identifier for a bibliographic reference. -- DickFurnas - 03 Nov 2009
  • FINALS_Glencoe.ods attached. Has nearly everything parsed out into separate columns. Missing is file name and a column for the date of original source for data... and where does section number come from? the file name? -- DickFurnas - 03 Nov 2009
  • UCSMP Data Chapters 1-8 attached. -- DickFurnas - 02 Nov 2009
  • Dick- here is an initial attempt of the spreadsheet (including all Core Plus data). I'll talk to you about filling in the missing parameters soon. -- GabrielDobbs - 02 Nov 2009
  • Thanks Dick, I just had to reregister. I just uploaded some sample files as well. -- GabrielDobbs - 13 Oct 2009
  • Hi Gabriel. Here are the data sets I've gotten so far. -- DickFurnas - 18 Sep 2009
  • First Post :-) -- DickFurnas - 18 Sep 2009

Initial reconnaissance.

Before getting too carried away, I wanted to reconnoiter the first column of the FINALS.xlsx spreadsheet since it seemed to have the most diverse assortment of information in it.

Here are some command-line operations I performed on the Mac after selecting the first column of FINALS.xlsx and copying it to the clipboard:

Scale of stuff to look at

pbpaste | cat | wc
   65536    9807  101150
  • pbpaste takes the contents of the clipboard and sends it to stdout
  • | pipe character which "pipes" stdout to stdin of the following command
  • cat copies stdin to stdout. I use cat here defensively: in this simple pipeline it could have been omitted with identical results, but I have encountered situations in which subsequent processing of the clipboard contents behaved better when I inserted cat after pbpaste. It seemed to "condition" the text for use by other utilities, perhaps something to do with encodings. It may be superfluous voodoo :-)
  • wc performs a word count, reporting the number of lines, words, and bytes (the same as characters for plain ASCII text)
  • The clipboard apparently has
    • 65536 lines -- more than I want to look at :-) -- Most are probably empty. That number is 2^16 and probably represents the maximum number of possible rows.
    • 9807 words
    • 101150 characters
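
To check the guess that most of those 65536 lines are empty, the non-blank lines can be counted directly. This is a minimal sketch, not something I ran at the time; count_nonempty is a hypothetical helper name, and it treats a line containing only whitespace as empty.

```shell
# Hypothetical helper: count the lines on stdin that contain
# something other than whitespace.
count_nonempty() {
  # -v inverts the match, -c prints a count instead of the lines.
  # grep exits non-zero when the count is 0, so "|| true" keeps the
  # helper usable in scripts that run with "set -e".
  grep -c -v '^[[:space:]]*$' || true
}

# Usage on the Mac, with the column still on the clipboard:
#   pbpaste | count_nonempty
```

Subtracting the result from the wc line count gives the number of empty rows.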

Scale of unique stuff

pbpaste | cat | sort | uniq -c | wc
     967    2571   14253
  • sort sorts the lines
  • uniq -c collapses each run of identical adjacent lines into one (which is why sort comes first); the -c flag prefixes each line with a count of how many times it occurred
  • I was actually wanting to see the unique lines, but by starting with wc I got an idea of how much stuff I was going to need to look at, here nearly 1000 lines.
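
With nearly 1000 unique lines, it can help to look at the most frequent values first. A possible extension of the pipeline above, sketched under the assumption that the same clipboard contents are piped in; top_values is a hypothetical helper name and the default of 20 lines is arbitrary.

```shell
# Hypothetical helper: show the N most frequent lines on stdin,
# most common first (N defaults to 20).
top_values() {
  # sort groups identical lines, uniq -c counts each group,
  # sort -rn orders the groups by count (largest first),
  # head keeps only the first N of them.
  sort | uniq -c | sort -rn | head -n "${1:-20}"
}

# Usage on the Mac, with the column still on the clipboard:
#   pbpaste | top_values 10
```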

The unique stuff

pbpaste | cat | sort | uniq -c | less
  • same as above except replace wc with less which lets me page backwards and forwards through the output.

The unique stuff of likely interest that isn't a problem number

pbpaste | cat | grep '^[A-Z]' | sort | uniq -c | less
  • similar to above, but only show lines which start with a capital letter
  • grep (the name comes from the ed command g/re/p: globally search for a regular expression and print) looks at each line and passes the ones which match to stdout, discarding non-matches
    • ^ anchors to the start of the line
    • [A-Z] matches any single character in the given range
    • single quotes to protect the search pattern from interpretation by the shell

Check the other stuff

pbpaste | cat | grep -v '^[A-Z]' | sort | uniq -c | less
  • same as above, except the -v flag tells grep to invert the match and send to stdout the lines which do not match
  • Why? To see if I missed anything of interest.
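
Because grep and grep -v split the input into two disjoint sets, their line counts should add up to the total. A minimal sketch of that sanity check; partition_counts is a hypothetical helper name, and forcing LC_ALL=C is my assumption to make sure [A-Z] means only the ASCII capital letters regardless of locale.

```shell
# Hypothetical helper: print "matched unmatched total" for the
# pattern ^[A-Z] applied to the lines on stdin.
partition_counts() {
  pc_tmp=$(mktemp) || return 1
  cat > "$pc_tmp"          # stdin is read three times, so save it first
  # grep -c exits non-zero when the count is 0; "|| true" ignores that.
  pc_matched=$(LC_ALL=C grep -c '^[A-Z]' "$pc_tmp" || true)
  pc_unmatched=$(LC_ALL=C grep -c -v '^[A-Z]' "$pc_tmp" || true)
  # Some wc implementations pad the number with spaces; strip them.
  pc_total=$(wc -l < "$pc_tmp" | tr -d '[:space:]')
  rm -f "$pc_tmp"
  echo "$pc_matched $pc_unmatched $pc_total"
}

# Usage on the Mac, with the column still on the clipboard:
#   pbpaste | partition_counts
```

If the first two numbers do not sum to the third, something in the input (a missing final newline, for instance) deserves a closer look.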

-- DickFurnas - 2009-09-18

Topic attachments
| Attachment | History | Size | Date | Who | Comment |
| (zip archive) | r1 | 214.8 K | 2009-11-17 - 20:04 | MaryAnnHuntley | University of Chicago School Mathematics Project |
| FINALS_UCSMP.ods | r4 r3 r2 r1 | 860.6 K | 2009-11-24 - 18:02 | DickFurnas | |
| UCSMP_Topic_vs_Time.png | r1 | 138.1 K | 2009-11-24 - 18:00 | DickFurnas | |
| CPMP_Topic_vs_Time.pdf | r1 | 496.2 K | 2009-11-25 - 01:49 | DickFurnas | PDF file. Open in a PDF reader and zoom in and out to see details. |
| CPMP_Topic_vs_Time.png | r1 | 133.0 K | 2009-11-25 - 01:26 | DickFurnas | |
| FINALS_Glencoe.ods | r2 r1 | 757.4 K | 2009-11-25 - 01:30 | DickFurnas | |
| UCSMP_Topic_vs_Time.pdf | r1 | 380.4 K | 2009-11-25 - 01:50 | DickFurnas | PDF file. Open in a PDF reader and zoom in and out to see details. |
Topic revision: r18 - 2009-11-25 - DickFurnas
This site is powered by the TWiki collaboration platform. Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.