How did we prepare the i.Experience mirror?

We recently updated the Greenfield site in preparation for:

  • An internal OEIT project to replace the video players in selected courses with the SpokenMedia project player.
  • An internal OEIT project to demonstrate search through SpokenMedia transcript files for delivery via MIT’s Google Search Appliance integrated with Greenfield OCW courses.
  • A joint project with MIT OpenCourseWare to test OER Recommender/Folksemantic.com recommendations with select OCW courses.

Here are the steps we used to prepare the i.Experience mirror of MIT OCW. This is our second implementation of i.Experience.

Notes:

  • I originally started writing this post in April, but I’ve updated it to include the most recent changes to the mirror site.
  • I expect to produce a new version of this guide in mid-/late-August as OCW issues a new version of their mirror.

Steps to prepare MIT OCW mirror for hosting as the i.Experience copy.

  • To reduce disk storage needs and speed up the text processing, we removed all of the metadata files (corresponding to the IMS Content Packaging specification) from the mirror copy. This saves approximately 0.7GB of disk space (from 1.28GB to 0.58GB).
    • In the root of the OCW mirror copy, we executed this shell script:
      find . -type f -name "*.xml" -exec rm -f {} \;
      echo -e '\a'; sleep 0.5; echo -e '\a'; echo -e '\a'; sleep 0.5; echo -e '\a';
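
Because the delete above cannot be undone, it can help to preview what it would remove first. The following is only a sketch, not part of the steps we ran: xml_report is a hypothetical helper name, and the size it prints is a rough figure taken from du.

```shell
# Hypothetical helper: xml_report DIR
# Prints how many *.xml files exist under DIR and roughly how much disk
# space they occupy, without deleting anything.
xml_report() {
    dir=$1
    # tr strips the padding that some wc implementations add.
    count=$(find "$dir" -type f -name '*.xml' | wc -l | tr -d ' ')
    # du -k reports each file's size in kilobytes; awk adds them up.
    kb=$(find "$dir" -type f -name '*.xml' -exec du -k {} + |
        awk '{ total += $1 } END { print total + 0 }')
    echo "$count files, about $kb KB"
}
```

Running xml_report . in the root of the mirror copy before the find/rm command gives a number to compare against the space actually recovered.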
  • The MIT OCW Mirror has hard-coded links (i.e., anchor links 'a href="/OcwWeb/...') that don’t match with how we wish to serve the site. Since we wanted to have the capability to host multiple mirror sites from the Greenfield server, we needed to address these links. There are a number of mechanisms to do it–the one we chose was to just do a search and replace for the path names:
    • We tried using shell scripting, and also BBEdit, to handle the bulk search and replace operations. Both had their limitations–I could not get some of the more complex replacements to work with sed, and I had to split the files up into 3 groups to use BBEdit (I suspect BBEdit was running out of memory).
    • For all of the intra-site links and image references in text files, we replaced "/OcwWeb with "/oeit/OcwWeb
    • Similarly, we replaced links for a specific javascript file, /OcwWeb/js/Rotatingprofiles.js, with /oeit/OcwWeb/js/Rotatingprofiles.js. Originally, this replacement of .js files was done as a separate step, but the above search and replace also replaces the links to the javascript files.
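
For readers who want to script the simple prefix replacement, here is a minimal sketch. It is an illustration rather than the exact commands we used: rewrite_prefix is a hypothetical helper name, and the list of file extensions is an assumption about which files count as text files in the mirror.

```shell
# Hypothetical helper: rewrite_prefix DIR
# Prefixes root-relative "/OcwWeb links with /oeit in every HTML, CSS and
# JavaScript file under DIR. The extension list is an assumption.
rewrite_prefix() {
    dir=$1
    find "$dir" -type f \( -name '*.htm' -o -name '*.html' \
        -o -name '*.js' -o -name '*.css' \) |
    while IFS= read -r file; do
        # Skip files with nothing to change so they are left untouched.
        grep -q '"/OcwWeb' "$file" || continue
        # Write to a temporary file instead of using sed -i, whose syntax
        # differs between GNU sed and the BSD sed shipped with Mac OS X.
        sed 's|"/OcwWeb|"/oeit/OcwWeb|g' "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    done
}
```

Because the search string includes the leading double quote, a link that has already been rewritten ("/oeit/OcwWeb...) no longer matches, so running the helper twice does not double the prefix.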
  • We commented out the references to the advanced search page. We replaced <a href="/oeit/OcwWeb/search/AdvancedSearch.htm">Advanced Search</a> with <!--<a href="/oeit/OcwWeb/search/AdvancedSearch.htm">Advanced Search</a>-->. We will ultimately be adding the Greenfield site to MIT’s Google Search Appliance and restricting the search results to queries originating from greenfield.mit.edu, but we needed to get a copy of the site online for the appliance to index first.
  • We alerted users that the download links are not working (due to the restructuring of the ocw.mit.edu website). We replaced:
    <h1>Download Course Materials</h1>
    <p>

    With:
    <h1>Download Course Materials</h1>
    <p><strong>Note: Download is non-functional</strong></p><p>
  • (This change should not be necessary in the future) We reinstated the RSS functionality–or to be more precise, we updated the RSS links so they provide the MIT OCW feeds. At some point in the future we will create a set of RSS feeds from the Greenfield site. We replaced: http://feeds.pheedo.com/oeit/OcwWeb/rss with http://feeds.pheedo.com/OcwWeb/rss/
  • To further reduce the file size impact of the mirror, we originally redirected all of the links to individual resources (PowerPoint slides, PDF documents, etc.) to the live OCW website, which would have saved approximately 22GB of disk space (not including videos). However, OCW changed their website structure in late May 2010 and did not put server redirects/rewrites in place for these NR links, so we now point the links at the mirror's own copy of the files instead.
    • Originally we replaced /NR/ with http://ocw.mit.edu/NR/. We now replace /NR/ with /oeit/NR/.
  • We added a header at the top of each page to reduce the likelihood of visitors confusing the mirror site with the live MIT OCW site. We replaced (there were 4 separate body tags we had to replace):
    <body id="global" onunload="OCWCustomResearch();">
    <body onunload="OCWCustomResearch();">
    <body onunload="OCWCustomResearch();" id="global">
    <body id="home" onunload="OCWCustomResearch();" onload=" changeQuote();">
    With:
    <body id="global" onunload="OCWCustomResearch();">
    <span align="center"><div style="text-align:center;font-weight:bold;color:#ffffff;font-size:8pt;background-color:#808285;width:100%;padding:2px;padding-bottom:5px;position:fixed;top:0;left:0;">This EXPERIMENTAL mirror of MIT OCW brought to you by the MIT Office of Educational Innovation and Technology :: <a href="http://greenfield.mit.edu/" style="color:#ffffff;">About Project Greenfield</a></div></span>
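
An alternative to matching each of the four body tags literally is a single pattern that matches any opening body tag. The sketch below is not what we ran. It uses a different technique that keeps each page's original body tag and appends the banner after it, rather than normalizing every tag to id="global". add_banner is a hypothetical helper name, and the sketch assumes the body tag sits on one line and the banner text contains no "|", "&" or backslash characters.

```shell
# Hypothetical helper: add_banner FILE BANNER_HTML
# Appends BANNER_HTML right after the opening <body ...> tag, whatever
# attributes that tag carries. Assumes the tag is on a single line and
# BANNER_HTML contains no "|", "&" or backslash characters.
add_banner() {
    file=$1
    banner=$2
    # Do nothing if the banner is already present, so the step is repeatable.
    grep -qF -e "$banner" "$file" && return 0
    # "&" in the replacement stands for the matched body tag itself.
    sed "s|<body[^>]*>|&$banner|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

A call such as add_banner index.htm '<div>Experimental mirror</div>' would leave the page's onload and onunload handlers exactly as they were.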
  • There are a few .jsp links in the left navigation and top navigation that we did not attempt to get working. We commented out the links, but left the text to preserve the look and feel.
    • We replaced <li id="lftNavActnsFeedback" class="courses"><a href="/oeit/OcwWeb/jsp/feedback.jsp?Referer=">Send us your feedback</a></li> with <li id="lftNavActnsFeedback" class="courses"><!--<a href="/oeit/OcwWeb/jsp/feedback.jsp?Referer=">-->Send us your feedback</a></li>.
    • We replaced <li id="lftNavActnsEmail"><a href="javascript:emailPopUp()">Email this page</a></li> with <li id="lftNavActnsEmail"><!--<a href="javascript:emailPopUp()">-->Email this page</a></li>.
    • We replaced <li id="lftNavActnsNewsletter"><a href="/oeit/OcwWeb/jsp/subscribe.jsp">Newsletter sign-up</a></li> with <li id="lftNavActnsNewsletter"><!--<a href="/oeit/OcwWeb/jsp/subscribe.jsp">-->Newsletter sign-up</a></li>.
    • We replaced <li id="lftNavActnsCite"><a href="/oeit with <li id="lftNavActnsCite"><!--<a href="/oeit.
    • We replaced <a href="/oeit/OcwWeb/jsp/newsletter.jsp"><img src="/oeit/OcwWeb/images/newsletter_signup_trans.gif" alt="OCW Newsletter Signup" width="128" /></a> with <!--<a href="/oeit/OcwWeb/jsp/newsletter.jsp"><img src="/oeit/OcwWeb/images/newsletter_signup_trans.gif" alt="OCW Newsletter Signup" width="128" /></a>-->. (There were two versions of this code we replaced.)
  • To prevent users from emailing questions about Greenfield to MIT OCW, we commented out the email hyperlinks.
    • There were occurrences in two javascript files: /js/styleswitch_search.js and /js/styleswitch.js. We replaced <ul><li class="email"><a href="javascript:emailPopUp()">Email this page</a></li></ul> with <!--<ul><li class="email"><a href="javascript:emailPopUp()">Email this page</a></li></ul>-->.
    • And, there were Contact Us links to disable (there were 4 separate versions we had to replace). We replaced <a href="/oeit/OcwWeb/jsp/feedback.jsp?Referer=">Contact
      Us</a>
      with <!--<a href="/oeit/OcwWeb/jsp/feedback.jsp?Referer=">-->Contact
      Us<!--</a>-->
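
The link-disabling replacements above follow one pattern: comment out the anchor tags but keep the visible text. Here is a minimal sketch of that pattern for links whose href points at a .jsp page, written under stated assumptions rather than as the procedure we followed. disable_jsp_links is a hypothetical helper name, and the sketch only handles links that sit on one line and contain plain text, so the Contact Us links split across two lines and the image link would still need separate handling.

```shell
# Hypothetical helper: disable_jsp_links FILE
# Comments out the opening and closing tags of every single-line
# <a href="...jsp..."> link in FILE while keeping the link text visible.
# Links that span lines or wrap other tags (such as <img>) are left alone.
disable_jsp_links() {
    file=$1
    sed 's|<a href="\([^"]*\.jsp[^"]*\)">\([^<]*\)</a>|<!--<a href="\1">-->\2<!--</a>-->|g' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

After a link has been rewritten, the text following its opening tag ends in <!--</a>--> rather than a bare </a>, so a second run finds nothing left to change.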
  • We added a Google Analytics link to each page. We replaced </body> with <script type="text/javascript">
    var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
    </script>
    <script type="text/javascript">
    try {
    var pageTracker = _gat._getTracker("UA-9353349-4");
    pageTracker._setDomainName("none");
    pageTracker._setAllowLinker(true);
    pageTracker._trackPageview();
    } catch(err) {}</script>
    </body>
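
Multi-line insertions like the analytics snippet are awkward to express as a sed one-liner, so one option is to keep the snippet in its own file and splice it in with awk. This is a sketch only: add_snippet and the snippet file are hypothetical, and it assumes each page has one line containing </body> with nothing that must stay ahead of the snippet on that line.

```shell
# Hypothetical helper: add_snippet FILE SNIPPET_FILE
# Inserts the contents of SNIPPET_FILE on the lines just before the first
# line of FILE that contains </body>.
add_snippet() {
    file=$1
    snippet=$2
    awk -v snippet="$snippet" '
        # Read the whole snippet once, before processing the page.
        BEGIN { while ((getline line < snippet) > 0) text = text line "\n" }
        # Emit the snippet ahead of the first closing body tag only.
        /<\/body>/ && !done { printf "%s", text; done = 1 }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Unlike the other sketches, this one is not safe to repeat: running it twice on the same page inserts the snippet twice, so it should be applied once to a fresh copy of the mirror.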

Unless otherwise specified, the Greenfield Website by the MIT Office of Digital Learning, Strategic Education Initiatives is licensed under a Creative Commons Attribution 4.0 International License.
Portions subject to the MIT OpenCourseWare Creative Commons License and Terms of Use.