Clustering is an important technique for grouping data. Some examples include the following.
Customer/market segmentation
Graphics such as computer vision
2. Exact and approximate clustering
Clustering is based on some idea of what it means for two entities to be "equal" or, in most cases, "almost equal" in some sense.
We will look at the following.
Clustering and visualization in general
Exact clustering (when possible)
Approximate clustering
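The difference between "equal" and "almost equal" can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function names and the `tolerance` parameter are hypothetical and do not come from any particular clustering method.

```c
#include <stdbool.h>

/* Exact comparison: two values match only if they are identical. */
bool exactly_equal(double a, double b) {
    return a == b;
}

/* Approximate comparison: two values match if they differ by at most
   `tolerance`, a hypothetical threshold chosen by whoever does the analysis. */
bool almost_equal(double a, double b, double tolerance) {
    double diff = a - b;
    if (diff < 0.0) {
        diff = -diff;   /* absolute value of the difference */
    }
    return diff <= tolerance;
}
```

With a tolerance of 0.1, the values 1.0 and 1.05 would be treated as "almost equal" even though they are not exactly equal.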
3. Human visualization
Humans have a unique ability to abstract and recognize patterns and make abstract inferences from those recognized patterns.
4. Abstraction
Human brains are built for complex abstraction. What does that mean exactly?
In abstract art, something is taken away and something remains; one then needs to interpret what is meant or intended.
To abstract is to take away everything but the essentials and thereby to ignore certain differences.
For most purposes, an abstraction is looking at similarities and ignoring differences. The similarity is what is the same. The difference is what is different.
5. Hoare: Abstraction
In simple terms, abstraction is looking at similarities and ignoring differences.
"In the development of the understanding of complex phenomena, the most powerful tool available to the human intellect is abstraction. Abstraction arises from a recognition of similarities between certain objects, situations, or processes in the real world, and the decision to concentrate on these similarities, and to ignore for the time being the differences."
Computer scientist Tony Hoare, in Dahl, O., Dijkstra, E., & Hoare, C. (1972). Structured programming. New York: Academic Press, p. 83.
6. Higher level intelligence
Abstraction is the key to higher level intelligence. That is why so many questions are of the form, "What is the primary similarity and difference between ...".
Much of the work with programming languages in computer science involves looking for patterns in text and making abstractions.
7. Seeing and thinking
How many triangles do you see?
8. Kanizsa triangle
Do you see the triangle?
There is no triangle! Your brain makes the triangle that you see.
Abstraction involves looking at similarities and differences and filling in missing details - sometimes appropriately, sometimes inappropriately.
9. More triangles
How many triangles do you see now? Do they exist?
To many, the triangle that is seen but does not exist appears brighter than the surrounding area.
10. Gaetano Kanizsa
This type of illusion was discovered/created by Gaetano Kanizsa, who popularized such illusions in part through his 1976 Scientific American paper on the subject (though he had been working on such ideas for many years before this paper).
11. With dots
The triangle is still seen when just dots are present at the corners.
Can a triangle that does not exist be "whiter than white"? White is white, right?
12. Necker cube
Here is a Necker cube. Which corner is nearest to the viewer?
Do you see the Necker cube now? It does not exist except in your mind.
13. Programming abstractions
In programming terms, to abstract is to replace one or more parts of a program with a name that refers to the replaced parts (thus hiding the details). Here are some programming constructs that are used for abstraction.
constants and variables
procedures/functions with parameters
modules
objects and classes
... and many other concepts ...
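As a minimal sketch of the first two constructs in the list, the C fragment below gives a name to a literal value and a name to a calculation. The names `SECONDS_PER_MINUTE`, `minutes_to_seconds`, and `duration_in_seconds` are hypothetical and chosen only for illustration.

```c
/* A named constant: the literal 60 appears in exactly one place. */
enum { SECONDS_PER_MINUTE = 60 };

/* A function with a parameter: the calculation gets a name, and callers
   no longer need to know how it is carried out. */
int minutes_to_seconds(int minutes) {
    return minutes * SECONDS_PER_MINUTE;
}

/* Built on the abstraction above instead of repeating "* 60". */
int duration_in_seconds(int minutes, int seconds) {
    return minutes_to_seconds(minutes) + seconds;
}
```

Code that calls `minutes_to_seconds(3)` refers to the replaced part by name, and the detail of multiplying by 60 is hidden behind that name.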
14. Programming language work
Much of the work that I assign for programming language assignments follows this general pattern. Here is the setup.
Provide example code and explanations for concept A.
Provide example code and explanations for concept B.
Here is the requirement.
Do something that combines concepts A and B into C.
By understanding A and B, including a lot of textual abstracting, one can then write/construct a program to do C.
15. Example programming abstraction
Consider the following pattern (which could continue further in an extended example).
Two times 0 is 0.
Two times 1 is 2.
Two times 2 is 4.
Two times 3 is 6.
Two times 4 is 8.
Two times 5 is 10.
Two times 6 is 12.
Two times 7 is 14.
Two times 8 is 16.
Two times 9 is 18.
A good programmer would immediately visually recognize the pattern and, if asked, could write a simple program such as the following to output that pattern.
Here is the C code.
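A minimal sketch of such a program, assuming the output must match the pattern above line for line, might look as follows. The function name `print_doubles` and the `FILE *` parameter are assumptions made for illustration so that the caller decides where the output goes.

```c
#include <stdio.h>

/* Writes "Two times i is 2*i." for i = 0 through 9, one line per value. */
void print_doubles(FILE *out) {
    for (int i = 0; i < 10; i++) {
        fprintf(out, "Two times %d is %d.\n", i, 2 * i);
    }
}
```

Calling `print_doubles(stdout)` from `main` writes the ten lines of the pattern. The loop states the pattern once and lets the computer repeat it.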
The output of the C code is the ten-line pattern listed above.
A not-so-good programmer would not see the pattern and might attempt the following.
Here is the C code.
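A sketch of what such an attempt might look like is given below: every line of the pattern is typed out by hand. The function name `print_doubles_by_hand` and the `FILE *` parameter are assumptions made for illustration.

```c
#include <stdio.h>

/* Writes the same ten lines, but each one is spelled out separately,
   so the pattern is repeated by the programmer rather than by the computer. */
void print_doubles_by_hand(FILE *out) {
    fputs("Two times 0 is 0.\n", out);
    fputs("Two times 1 is 2.\n", out);
    fputs("Two times 2 is 4.\n", out);
    fputs("Two times 3 is 6.\n", out);
    fputs("Two times 4 is 8.\n", out);
    fputs("Two times 5 is 10.\n", out);
    fputs("Two times 6 is 12.\n", out);
    fputs("Two times 7 is 14.\n", out);
    fputs("Two times 8 is 16.\n", out);
    fputs("Two times 9 is 18.\n", out);
}
```

Calling `print_doubles_by_hand(stdout)` from `main` writes the ten lines. Nothing checks that each typed product is correct, and extending the pattern means typing more lines.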
The output of the C code is the same ten-line pattern listed above.
16. Comparison in general
In both cases, the program has the same output.
But the first program has less redundancy (repetition) and is considered the better program.
In general, the smaller of two programs that produce the same effect is considered the better program.
17. Comparison in specific terms
Specifically, the better program is smaller but also minimizes any non-computer-checked redundancy.
That means that any parameter or concept that is important in the program and that could be changed should be changeable in one and only one place.
Note: If the computer is checking the redundancy (and not a human), then that redundancy is not necessarily bad. (Backup systems are redundant but useful.) And there is redundancy in a program that cannot be avoided (e.g., variables with the same name, whether they represent the same or different memory locations).
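One way to sketch the "one and only one place" idea in C is shown below, under the assumption that the multiplier and the number of lines are the parameters likely to change. The names `FACTOR`, `COUNT`, `NUMBER_WORDS`, and `print_times_table` are hypothetical. The word "Two" is looked up from `FACTOR` rather than typed separately, and a compile-time check lets the computer, not a human, verify that a word exists for the chosen value.

```c
#include <stdio.h>

/* Each value that might change is defined exactly once. */
enum { FACTOR = 2, COUNT = 10 };

/* Words for 0 through 9, so the text "Two" is derived from FACTOR. */
static const char *const NUMBER_WORDS[] = {
    "Zero", "One", "Two", "Three", "Four",
    "Five", "Six", "Seven", "Eight", "Nine"
};

/* Computer-checked constraint: FACTOR must have an entry in NUMBER_WORDS. */
_Static_assert(FACTOR >= 0 && FACTOR < 10,
               "FACTOR must be between 0 and 9");

/* Writes "<Word> times i is FACTOR*i." for i = 0 through COUNT - 1. */
void print_times_table(FILE *out) {
    for (int i = 0; i < COUNT; i++) {
        fprintf(out, "%s times %d is %d.\n",
                NUMBER_WORDS[FACTOR], i, FACTOR * i);
    }
}
```

Changing `FACTOR` to 3 would change both the word and the products in every output line, with no other edit needed.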
18. Bloom's taxonomy
Bloom's taxonomy of educational objectives has a foundation of knowledge and remembering and a goal of abstract problem solving - evaluation and creating.
19. Music analogy
Think of music, playing a musical instrument, and creating a score of music to play. In computer terms, the musical analogy is as follows.
Using a computer (program) is listening to music.
Writing a program from a design (including pseudo-code) is playing an instrument, such as the C programming instrument, the Java programming instrument, etc.
Designing and creating pseudo-code to solve a problem is creating a score of music to be played.
20. Learning music
How does one learn music (after listening to music for a while)?
An approach often used with beginning programmers in computer science is the following.
Think of some music to which you like to listen.
Now create a musical score for that music.
Now play that musical score on the musical instrument that you are here to learn.
Will that work well?
How about the following.
Think of some music to which you like to listen.
Here is a musical score (design, pseudo-code, etc.) that represents that music.
You will now learn how to play your instrument (write a program in a language that implements the design) using that musical score (design and pseudo-code) for the music to which you like to listen.
21. Dimensions
Humans can easily visualize 2D or 3D in graphics but higher dimensions are harder to visualize.
In data science, one often learns concepts using examples in 2D or 3D and then generalizes via abstraction to many more dimensions.
Working in 2D or 3D can thus help one understand the method that then generalizes to higher dimensions.
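As a small illustration of a method that generalizes, the sketch below computes the squared straight-line distance between two points. The choice of squared Euclidean distance is an assumption made for this example, and the name `squared_distance` is hypothetical. The same loop works for 2, 3, or any number of dimensions because the dimension count is a parameter.

```c
#include <stddef.h>

/* Squared Euclidean distance between points p and q, each with `dims`
   coordinates. Nothing in the code is specific to 2D or 3D. */
double squared_distance(const double *p, const double *q, size_t dims) {
    double sum = 0.0;
    for (size_t k = 0; k < dims; k++) {
        double diff = p[k] - q[k];
        sum += diff * diff;
    }
    return sum;
}
```

For the 2D points (0, 0) and (3, 4) the result is 25, which is easy to picture as a 3-4-5 right triangle; the identical call with `dims` set to 5 handles points that cannot be drawn.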