New Online Certificate in Data Mining

Discussion in 'General Distance Learning Discussions' started by Tom Head, Aug 24, 2001.

  1. Tom Head

    Tom Head New Member

    From CCSU's Daniel Larose, who posted this to several AI-related newsgroups:

    CCSU Launches Online Certificate in Data Mining

    Central Connecticut State University (CCSU) announces the launching of
    a Certificate program in data mining, available completely online.
    The Certificate in Data Mining is the world's first such program to be
    made available completely online, according to Daniel T. Larose,
    Ph.D., Associate Professor of Statistics and Program Coordinator for
    Data Mining at CCSU.

    The Certificate consists of four undergraduate courses, and may be
    completed in one year. The thrust of the program is to provide
    students with practical, hands-on experience with what the MIT
    Technology Review called one of the ten emerging technologies that
    will change the world (MIT Technology Review, Jan / Feb 2001).
    Students will apply such methodologies as decision trees, market
    basket analysis, neural networks, classification rules, and cluster
    detection. Students will gain strong exposure to the Clementine data
    mining software suite from SPSS, which is ideally suited to an online
    program, since student versions are available.

    The first course in the Certificate sequence begins online in
    September. Also beginning in September is an online graduate course
    in data mining, along with online courses in mathematical statistics,
    experimental design, JAVA, and calculus. Online registration is
    taking place now at onlinecsu.ctstateu.edu. For more information,
    visit the Data Mining at CCSU website at www.math.ccsu.edu/dm or
    contact the Program Coordinator at [email protected].



    Cheers,


    ------------------
    Tom Head
    www.tomhead.net
     
  2. Bill Huffman

    Bill Huffman Well-Known Member

    Some may ask, "What the heck is Data Mining?"

    Others may say, "I didn't get that Data Mining joke."

    Well it's not a joke. Let me give you a real world example that I think is interesting.

    There's a large retail chain that has a database in the hundreds of terabytes (a terabyte is 1,000 gigabytes). This one database contains the sales detail for every store in the chain.

    Using data mining, they noticed one morning that, the day before, two of their stores had sold far more drinking straws than usual, and far more than the chain's other stores. The head office called the two stores and discovered that they had moved the drinking straws next to the Kool-Aid display because of a Kool-Aid sale that had just started. By the next day, all the stores had moved drinking straws next to the Kool-Aid.

    By the end of the Kool-Aid sale, the increase in net on straw sales was estimated at something like 2 or 3 million dollars, IIRC. You can imagine that their computer system is expensive to buy and maintain, but with results like that, they can't afford not to do data mining.
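
    A rough sketch of the kind of check described above: flag the stores whose sales of one item sit far above the rest of the chain. The store names, the sales figures, and the function name are all made up for illustration, and the modified z-score used here is just one common outlier test, not necessarily what the retailer ran.

```python
from statistics import median


def flag_unusual_stores(units_by_store, threshold=3.5):
    """Return stores whose unit sales are far above the chain-wide median.

    Uses the modified z-score, which is built on the median absolute
    deviation (MAD) and so is not dragged upward by the very outliers
    it is trying to find. The 3.5 cutoff is a conventional default.
    """
    values = list(units_by_store.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Every store sold about the same amount; nothing stands out.
        return []
    flagged = []
    for store, units in units_by_store.items():
        score = 0.6745 * (units - med) / mad
        if score > threshold:
            flagged.append(store)
    return sorted(flagged)


# Hypothetical one-day straw sales (units) per store.
straw_sales = {
    "store_01": 40, "store_02": 38, "store_03": 45, "store_04": 41,
    "store_05": 160, "store_06": 39, "store_07": 155, "store_08": 43,
}
unusual = flag_unusual_stores(straw_sales)
print(unusual)
```

    The flag only says that two stores look different. Finding out why (the straws had been moved next to the Kool-Aid) still took a phone call.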
     
  3. Orson

    Orson New Member

    The term "Data Mining" has more questionable associations in the field of epidemiology. If you have surfed onto "junkscience.com" or have read books by Steven J. Milloy, the site's founder, you'll grasp the pejorative use of the term.

    But since I have an old friend now at MIT teaching statistics, I'll email him and invite his comment.
     
  4. porky_pig_jr

    porky_pig_jr New Member



    I can't help but remember a really funny Dilbert cartoon about data mining.

    Dogbert is a specialist in data mining. He wears a miner's outfit (a big light attached to his forehead) and is looking for some hidden messages from God.

    I would add 'knowledge management' to the same category as 'data mining'.
     
  5. Bill Highsmith

    Bill Highsmith New Member

    That would be interesting, because I don't think there is any "shady" side to data mining; there are shady researchers, however. That seems to be the gist of the anecdotes on the junkscience.com site.

    It is alleged that some epidemiologists with an agenda (or the EPA as a whole) relaxed the criteria in order to promote a statistical observation to cause-and-effect status. In some anecdotes, the epidemiologists also committed other mortal research sins, including ignoring evidence that did not support their agenda or research. To me, those were not data mining stories ("data dredging" in the pejorative of the website); they were agenda-driven research stories.

    I haven't read the book mentioned and only surveyed the anecdotes, so the author may have issues with data mining. However, the anecdote in another post shows how data mining is used when the only agenda is to discover new, true information from unstructured or variously structured data and make use of it: in that case, why drinking straws sold better in some stores than in others.
     
  6. Guest

    Guest Guest

    I thought the main point of "data mining" is to essentially spy on people--their personal information, their buying habits, the sites they visit on the Web, etc.
     
  7. Bill Highsmith

    Bill Highsmith New Member

    Gathering marketing information is certainly one common goal, but I don't think that analyzing web visitation data meets the definition of data mining. The epidemiology post gives an example of other uses, in this case, looking at habits such as smoking and correlating them to human disease. (The identity of the smokers is irrelevant, although demographic information has value.) Another example might be an expert system that calculates the insurability of an individual or a company based on statistical analysis gleaned from a data mining tool.

    The reason that I think that web visitation analysis is not by itself data mining is that the original purpose of data mining was to provide data analysis tools for information stored in a wide variety of formats...unstructured text, semi-structured text of many types, old ISAM databases, and various new database formats. Frequently, the objects of the data mining were data from legacy data processing systems in corporate archives going back many years, including pre-web years. Analyzing all this data presented a difficult software development task and tools emerged to handle the various file formats and provide powerful statistical analysis.

    Looking at web statistics alone is actually a very simple task. The web servers provide nicely structured data that is easily crunched. Of course, this information could be thrown into the general data mining pool and used in combination with other information.
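
    To show how little work the "easily crunched" case takes, here is a minimal sketch that counts successful page requests from access-log lines. It assumes the server writes the Common Log Format; the sample lines, paths, and function name are invented for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)$'
)


def count_hits(lines):
    """Count requests per path, keeping only responses with status 200."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group(5) == "200":
            hits[match.group(4)] += 1
    return hits


# Hypothetical log lines (addresses come from a documentation-only range).
sample = [
    '192.0.2.1 - - [24/Aug/2001:10:00:00 -0400] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [24/Aug/2001:10:00:05 -0400] "GET /products.html HTTP/1.0" 200 1500',
    '192.0.2.1 - - [24/Aug/2001:10:01:00 -0400] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.3 - - [24/Aug/2001:10:02:00 -0400] "GET /missing.html HTTP/1.0" 404 -',
]
hits = count_hits(sample)
print(hits.most_common())
```

    Every line has the same fields in the same order, which is exactly what legacy archives in mixed formats do not offer.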
     
  8. Bill Huffman

    Bill Huffman Well-Known Member

    My view of data mining is that it is just one of the more recent CS/IS buzzwords being thrown about. As computer hardware gets cheaper and the ability to collect larger amounts of data from different sources grows, databases are growing very quickly.

    The ability to do analysis on these large stores of data is quickly growing thanks to faster, cheaper hardware and the more powerful software tools being made available. The true cutting edge for these technologies is in business, not in academia. I believe that it moved from academia to business between 10 and 20 years ago, as SQL database systems started being developed. Data mining, OLAP, and data warehousing are well-known concepts that are in widespread use throughout the industry.
     
  9. Bill Huffman

    Bill Huffman Well-Known Member

     
  10. Bill Highsmith

    Bill Highsmith New Member

    One more thought...the questionable associations might come from datacrats who use datamining without having a firm knowledge of statistics and research. A common example is someone who concludes from datamining that obesity is caused by Diet Coke because that is what obese people drink more than anything else.

    I guess it is a matter of taste whether that is just bad science or datamining gone amok.
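
    The Diet Coke trap can be shown with a toy table. All the counts below are invented for illustration, and the variable and function names are hypothetical. The sketch assumes that people who were already overweight are the ones more likely to pick a diet drink.

```python
# (overweight_before, drinks_diet_soda) -> (number of people, number obese now)
# Made-up counts: people who were already overweight choose diet soda more often.
counts = {
    (True, True): (80, 40),
    (True, False): (20, 10),
    (False, True): (20, 2),
    (False, False): (80, 8),
}


def obesity_rate(drinks_diet_soda, overweight_before=None):
    """Share of people who are obese now, optionally within one stratum."""
    people = obese = 0
    for (before, drinks), (n, n_obese) in counts.items():
        if drinks != drinks_diet_soda:
            continue
        if overweight_before is not None and before != overweight_before:
            continue
        people += n
        obese += n_obese
    return obese / people


# Naive comparison: diet soda drinkers look much worse off.
print(obesity_rate(True), obesity_rate(False))

# Compared like with like, the gap disappears in both groups.
print(obesity_rate(True, True), obesity_rate(False, True))
print(obesity_rate(True, False), obesity_rate(False, False))
```

    In this made-up table the drink makes no difference once prior weight is taken into account. The raw correlation comes entirely from who chooses to drink it.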

    I guess the name is expanding in meaning. I don't know exactly when the term first appeared, but the concepts go back to about 1989 or 1990 with some "knowledge engineering" folks.
     
  11. Bill Huffman

    Bill Huffman Well-Known Member

     
