I’ve endeavored to model the scattering of an acoustical plane wave off the surface of a rigid sphere. As with most computational models, we have no “reference” against which to compare the results: how can we know they’re accurate? And even though analytical solutions exist for the problem, they are (1) not perfectly implementable, since they require infinite sums (which must be truncated), and (2) quite challenging to program, and we have no intuition for how they should look. An analysis of limiting cases and of the actual physical scenario is therefore needed to determine whether or not the results are accurate.
With regard to point (1) above, to solve the problem at hand in practice we must truncate an infinite sum. So how do we know where to truncate? One possible method is to add more and more terms until the solution stops changing noticeably (for this project, results were plotted and changes found by inspection) or actually begins getting worse (for more on this see my most recent post on pages.vassar.edu, specifically where aliasing errors are discussed). Keeping only the first sqrt(M) terms in the sum, where M is the number of points on the sphere for which the scattering was computed (i.e., the number of sampling points), was found to be sufficient. This cutoff also appears in the literature cited for this project.
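A minimal sketch of this convergence test follows; the function p_scat, its inputs theta and ka, and the tolerance are all hypothetical stand-ins for the actual scattering code:

```matlab
% Hedged sketch: add terms to the truncated series until the scattered
% field stops changing. p_scat(N, theta, ka) is a hypothetical stand-in
% for an evaluation of the series truncated after N terms.
M    = 500;               % number of sampling points on the sphere
Nmax = ceil(sqrt(M));     % truncation order suggested by the literature
tol  = 1e-6;              % assumed convergence tolerance
p_old = p_scat(1, theta, ka);
for N = 2:Nmax
    p_new = p_scat(N, theta, ka);
    if max(abs(p_new - p_old)) < tol
        break;            % solution has stopped changing noticeably
    end
    p_old = p_new;
end
```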
For point (2), the examination of limiting cases is crucial. First, however, let’s consider what variables we’re dealing with. The primary independent variables of our scattering problem are the sphere radius and the sound frequency. But wavelength is essentially interchangeable with frequency, and if one imagines the relative sizes of the wavelength and the sphere radius, it becomes clear that there is really only one independent variable: the ratio of wavelength to sphere radius. In other words, increasing the sphere radius is the same as decreasing the wavelength of the incident sound, and vice versa; the physical scenario remains essentially the same.
We therefore focus on the effect of frequency on the scattering pattern. For very low frequencies, when the wavelength is much bigger than the size of the sphere, we should expect very little scattering, since such a huge wavelength effectively “ignores” the sphere. What scattering does occur should be relatively uniform, since the sound field varies little with respect to the size of the sphere. For high frequencies, when the wavelength is much smaller than the radius of the sphere, we expect the sphere to act like a “wall” and reflect a good portion of the sound back towards the angle of incidence. For wavelengths comparable to the radius of the sphere, the results become quite interesting. We expect some sort of complex scattering pattern that is neither uniform nor immediately predictable, though we still expect the majority of the energy to be scattered in the direction of incidence.
With a look at the figure below, which shows cross-sectional scattering patterns for 200, 300, 341, 400, 600, 1000, 2000, and 4000 Hz, respectively, it is clear that these expectations are met. I think we can rest assured that these results are accurate. Note that 341 Hz corresponds to a ratio of exactly 1 between the wavelength and the radius of the sphere, as the radius of the sphere was chosen to be 1 m and the speed of sound 341 m/s.
For the sake of space and visualization, each graph has been normalized. In practice, the low frequencies scatter very little energy in comparison with the higher frequencies. Note how uniform the lower frequencies are, and how the higher frequencies approach a delta function in the direction of incident sound, just as expected. 600 Hz looks especially interesting; I would never have guessed such behavior at the beginning of this project.
Finally, why do we care about this work? One of the main reasons is that it provides the basis for measurements of sound energy using a spherical microphone (a rigid sphere with many microphones on it). In order to measure the sound on the sphere, we need to know how sound behaves on the sphere. A spherical microphone, using a technique called beamforming, allows us to extract directional information from a sound field (i.e., to determine from which angle the majority of sound energy is coming). This is what I’ve attempted to do in my thesis. In addition, a spherical microphone also allows one to “reconstruct” a sound field (think virtual reality or sound maps), whereas a normal microphone does not. Future work for this project could include developing beamforming techniques, creating a system that can “intelligently” determine the direction of incoming sound, and extending the scattering analysis to other shapes of interest, perhaps a hemisphere or some sort of 3D polyhedron.
This project was designed to create rudimentary data-analytics software for a neuroprosthetic controller to be used in my prospective 2015-2016 senior thesis in physics. During this project, I developed a rudimentary data analysis program that inputs data from an EMG and calculates two key parameters: peak spike amplitude and spike duration.
Initial Goals
My initial goal for this project was to develop a data analysis program that served the following purposes:
1: Use a Fourier transform to excise the major sources of noise in the signal, including movement artifacts and common appliance noise
2: Design a program which can calculate data quasi-continuously (i.e., collect and analyze data in 0.1 s increments)
3: From the quasi-continuous data, find the peak cycle amplitudes and spike duration for each bin, and sort the signal into three regimes: a contracting state, a hold state, and a state representing a “rest” position.
4: Ensure the code is easily transferable to other formats / coding languages such that it can be used elsewhere in my thesis.
What I did:
1: The code was successful in sorting and solving for the peak amplitudes and spike durations of the experimental data. However, the spike duration is different from the cycle duration (which is more useful), and the latter is not currently calculated. The code can easily be adapted to solve for it; the reason it has not yet been modified is that the thresholds and methods by which cycles are delineated are unclear and are part of the larger thesis experiments, so they do not fall under the purview of this project.
2: Although the code developed for this project and attached to this submission cannot run quasi-continuously, the architecture can easily be rearranged so that it does.
3: The code does successfully sort the data into each of the three regimes, and it “tags” them numerically by adding whole numbers to distinguish each regime (0.xxxxx is a contracting state, 1.xxxxx is a hold state, 2.xxxxx is a rest state).
4: The code is easily transferable to other coding languages and formats. With the exception of the fast Fourier transform command and language-specific syntax, the commands used in the code are language independent. This will allow the code to be transferred to Arduino, which controls the movement of the arm itself, streamlining and speeding the execution of each of the experiments.
Methods
The code and the parameters it solves for were inspired primarily by two papers: “Filtering the surface EMG signal: Movement artifact and baseline noise contamination” by Carlo De Luca (2010) and “Modulation of in vivo power output during swimming in the African clawed frog” by Christopher Richards (2007). The code was written in Matlab, and three sample human EMG data sets were obtained from an online medical database.
The program has three main sections. First, matrices are created to store the relevant results and initial parameters of the system. Then the input EMG signal is chopped into bins 0.1 seconds in length, and each bin progresses through the rest of the code sequentially and independently. The chopped data is then passed through a Fourier transform, and the major sources of noise are excised: noise below 20 Hz, originating primarily from movement artifacts and low-frequency pressure waves, and 60 Hz noise stemming from most electrical appliances and lights.
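A minimal sketch of this noise excision is below; the sampling rate, the signal name emg, and the width of the 60 Hz notch are assumptions, not the exact thesis code:

```matlab
% Hedged sketch of the per-bin Fourier noise excision described above.
fs  = 1000;                        % sampling rate in Hz (assumed)
bin = emg(1:round(0.1*fs));        % one 0.1 s bin of raw EMG data
n   = numel(bin);
Y   = fft(bin);
f   = (0:n-1)*(fs/n);              % frequency associated with each FFT bin
f(f > fs/2) = f(f > fs/2) - fs;    % map the upper half to negative frequencies
kill = abs(f) < 20 | abs(abs(f) - 60) < 1;   % <20 Hz and ~60 Hz mains noise
Y(kill) = 0;
clean = real(ifft(Y));             % the rectified (noise-excised) bin
```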
The second section solves for the spike amplitudes in each bin. Using the findpeaks command together with nested if/else and while loops, the locations and magnitudes of the spikes are found. The locations of the peaks are passed into the third section, while the peak amplitudes are saved as one of the output parameters.
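The core of the spike detection can be sketched as follows; the threshold is an assumption and would be tuned against the real data:

```matlab
% Hedged sketch of the spike-detection step using findpeaks.
thresh = 0.5 * max(clean);              % assumed amplitude threshold
[pkAmps, pkLocs] = findpeaks(clean, 'MinPeakHeight', thresh);
pkTimes = pkLocs / fs;                  % sample indices converted to seconds
% pkAmps are saved as outputs; pkLocs feed the duration calculation below.
```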
The third section solves for the spike duration and splits the signal into regimes. To do so, a series of nested if/else and while loops proceeds from each spike and from the beginning of the bin. First, the spike durations are calculated using the locations of the peak amplitudes from the previous section of code. After the spike durations are found, the code cycles back and calculates the regimes where the arm is considered to be either holding or going to a rest state. Any excess time within the bin, created as a result of conditions that maintain code stability, is then characterized as a hold state, which allows the physical neuroprosthetic arm to catch up to the computation. The final output durations are then tagged according to the regime they belong to, and the results are output in a cycle-durations matrix for each bin.
A fourth section, which is utilized only if no spikes are detected in the current bin, goes through the same procedure of finding the hold- and rest-regime durations for the inactive bins.
In the code, graphs of the original signal and comparisons of the rectified and original signals, for each bin and for the total program, are created so the experimenter has a visual verification of the efficacy of the code.
Results:
As discussed previously, the code was both successful and stable. Following the Fourier noise rectification, the signal was clearer and more legible than the original, and the data was still maintained. The peak amplitude outputs are accurate and are positioned in the output matrices identically to their corresponding spike durations for easy identification and analysis. Similarly, the code successfully found the regime durations in each bin, with hold, contracting, and resting states all distinguishable, present in the appropriate locations, and placed sequentially.
Analysis of the output signal figure shows the noise reduction resulting from the Fourier noise analysis, and the table, showing the output cycle-duration matrix, shows how cycle durations were successfully found. To see larger images, please see either the Google Drive folder associated with this project or click on the images themselves.
Conclusion and Future Experiments
I am satisfied with this initial foray into neuroprosthetic data mining. The data outputs were accurate and returned the information I sought. Furthermore, the architecture of this code is easily transferable to other formats and coding languages, and it can be modified to solve for other experimental parameters depending on the needs of the other experiments.
Although this experiment is in and of itself quite successful, and a wide number of individual parameters can be calculated by altering the analysis code slightly, the architecture itself may prove unsatisfactory in later experiments, because this code currently only processes data from a single input. There are a number of models for nerve activity that rely on populations and pre-contraction initial conditions (Churchland, 2012); these are significantly more complicated and cannot be solved for using the current architecture. However, I believe that a population-level model of that kind will not be necessary, in which case this architecture is sufficient for later work.
Further modifications to this code will be to change the parameters the code solves for, as well as transferring the code to Arduino so that it can be installed directly in the robot arm. Furthermore, a significant amount of information is lost through data processing in the name of preventing crashes, an inefficiency that I intend to remove in the future. The code developed for this project will serve as an initial, rudimentary data-mining code which can be used in a number of different ways elsewhere in my future projects and investigations into neuroprosthetics.
Relevant Files can be found through the link below with headers:
1: De Luca, C., Gilmore, D., Kuznetsov, M., & Roy, S. Filtering the surface EMG signal: Movement artifact and baseline noise contamination. Journal of Biomechanics (2010) 1573-1579.
2: Richards, C., & Biewener, A. Modulation of in vivo power output during swimming in the African clawed frog. Journal of Experimental Biology 210 (2007) 3147-3159.
3: Lichtwark, G., & Wilson, A. A modified Hill muscle model that predicts muscle power output and efficiency during sinusoidal length changes. Journal of Experimental Biology 208 (2005) 2831-2843.
4: Giordano, N. J., & Nakanishi, H. Computational Physics.
5: Churchland, M., et al. Neural population dynamics during reaching. Nature (2012) 51-56.
Celestial mechanics is a fairly precise branch of physics, as well as one of the oldest, and many of its problems have well-known analytical solutions. Still, some problems exist for which an analytical solution is either very difficult or impossible to obtain. For the former, a computational analysis serves to verify analytical theory; for the latter, simulation may be the best way of understanding a problem other than direct observation. The goal of this project is to examine two celestial mechanics problems with a computational model: the precession of the perihelion of Mercury and the Kirkwood gaps.
Precession of Perihelion
For centuries astronomers had a prediction of Mercury’s orbit from Kepler’s laws which measurably differed from observational values. What astronomers found was that Mercury mostly obeyed Newton’s inverse-square law in its motion but deviated because the orientation of the orbit’s axes was rotating over time. This is referred to as the precession of the perihelion, because the perihelion (the point in the orbit at which Mercury is closest to the Sun) lies along the semimajor axis. Part of the precession could be attributed to the gravitational pull of the other planets (particularly Jupiter), but a residual deviation from theory remained a mystery until Albert Einstein published his theory of general relativity in 1915, which he showed explains the remaining precession of the perihelion of Mercury. Einstein’s adjustment to the inverse-square force law is as follows:
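$$F = \frac{G M_S M_M}{r^2}\left(1 + \frac{\alpha}{r^2}\right)$$

where $M_S$ and $M_M$ are the masses of the Sun and Mercury, r is the distance between them, and α sets the strength of the correction.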
The value of alpha calculated from general relativity is about 1.1 × 10⁻⁸ AU²; for this simulation, a much larger value of alpha is used so that the simulation doesn’t have to run as long to show precession. The value used in the main code of this program is 0.0008.
We then use the force law to calculate the velocity and position of Mercury. The derivation of the position and velocity with respect to x is as follows:
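$$\frac{dx}{dt} = v_x, \qquad \frac{dv_x}{dt} = -\frac{G M_S\, x}{r^3}\left(1 + \frac{\alpha}{r^2}\right)$$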
The derivation is virtually identical with respect to y. We then use these differential equations to create a computational algorithm; I chose to use the Runge-Kutta method because I felt that updating the values within the orbit-computation loop would lead to a more accurate simulation.
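A minimal sketch of the orbit loop is below, shown as an in-loop (Euler-Cromer style) update for brevity rather than the full Runge-Kutta code; the time step and initial conditions are assumptions in astronomical units, where GM_S = 4π² AU³/yr²:

```matlab
% Hedged sketch of the Mercury orbit computation (not the full Runge-Kutta code).
GM = 4*pi^2;  alpha = 0.0008;  dt = 1e-4;  nSteps = 2e5;
x = zeros(1,nSteps);  y = x;  vx = x;  vy = x;
x(1)  = 0.47;                       % Mercury's aphelion distance (AU)
vy(1) = 8.2;                        % approximate speed at aphelion (AU/yr)
for n = 1:nSteps-1
    r   = sqrt(x(n)^2 + y(n)^2);
    fac = GM*(1 + alpha/r^2)/r^3;   % force law with the alpha correction
    vx(n+1) = vx(n) - fac*x(n)*dt;  % update velocities first...
    vy(n+1) = vy(n) - fac*y(n)*dt;
    x(n+1)  = x(n) + vx(n+1)*dt;    % ...then positions from the new velocities
    y(n+1)  = y(n) + vy(n+1)*dt;
end
```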
A representative graph is also included to show how the orbit changes over time.
Now that we have code for the movement of Mercury, the next step is to find where the perihelion of each orbit lies. I did so by first conceptualizing what happens at perihelion: since it is the point of closest approach, the distance from the Sun reaches a local minimum there, so the change in that distance shrinks to zero and switches sign as Mercury passes it. Using this knowledge, I find the position at these points and convert it to an angle. The angle is then graphed against time to show how the perihelion precesses over time.
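A sketch of this perihelion search under those assumptions (a perihelion is flagged wherever the Sun-Mercury distance passes through a local minimum):

```matlab
% Hedged sketch: find perihelia as local minima of the orbital radius.
r = sqrt(x.^2 + y.^2);
t = (0:numel(r)-1)*dt;                      % matching time array
isPeri = [false, r(2:end-1) < r(1:end-2) & r(2:end-1) < r(3:end), false];
periAngle = atan2(y(isPeri), x(isPeri)) * 180/pi;   % perihelion angle (degrees)
periTime  = t(isPeri);
% plotting periAngle vs. periTime and fitting a line gives the precession rate
```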
The goal of this graph is to find the fitted slope of the angles, so as to compare our computed precession rate to the analytical precession rate per unit alpha, 1.1 × 10⁴ degrees per year per unit alpha. Using this value, we should be able to multiply it by the value of alpha predicted from general relativity to find a precession rate of about 1.2 × 10⁻⁴ degrees per year, or 43 arcseconds per century.
Continuing the program, I ran the precession simulation for values of alpha from 0.0001 to 0.004. I then created arrays to hold these alphas and their corresponding precession rates per alpha and plotted both with a fit line. The slope of this graph multiplied by the alpha from general relativity is compared to the analytical values.
The slope calculated from the angle vs. time graph was 8.4516 with an r value of 0.749. The precession rate per unit alpha here, calculated as slope/alpha, is therefore 1.067 × 10⁴ degrees/year/unit alpha, a relative error of 2.9% compared to the analytical value. This precession rate per alpha is then calculated again as the slope of the second graph, which comes out to 1.1139 × 10⁴ degrees/year/unit alpha, a relative error of 1.9%. Multiplying by Einstein’s alpha, we find a precession rate of 1.2253 × 10⁻⁴ degrees/year, or 44.11 arcseconds/century, with relative errors of 2.1% and 2.6% respectively.
Kirkwood Gaps
When astronomer Daniel Kirkwood plotted the number of known asteroids against their orbital radii, he found significant “gaps”: orbital radii at which few or no asteroids were found to exist. Eventually he discovered that these orbital radii were in resonance with Jupiter, the largest planet and therefore (as seen with the precession of the perihelion of Mercury) the planet most likely to perturb the orbits of other objects in the solar system. A resonance is a ratio of the orbital periods of two objects rotating around the same central body, and it is related to the orbital radii of the objects. So in this project, the orbital period of an asteroid located at the 2/1 resonance point with Jupiter is one half that of Jupiter’s, so it completes two orbits for every one Jupiter completes. The relation between orbital period and orbital radius is given by Kepler’s third law:
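$$P^2 \propto a^3$$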
where P is the orbital period and a is the orbital radius; this proportion holds for all objects orbiting the Sun. Knowing that Jupiter’s orbital radius is 5.2 AU, we can calculate the position of asteroids at resonance points by this ratio; i.e., the orbital radius is 3.27 AU at the 2/1 resonance, 2.5 AU at the 3/1 point, and so on. We can also use the calculated radius to find the initial velocity from the equation for orbital speed:
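$$v = \sqrt{\frac{G M_S}{a}}$$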
The reason Jupiter perturbs the asteroids so strongly at these radii is resonance: because an object in resonance is closest to Jupiter at the same few points in its orbit (for example, two points in the orbit of the 2/1 resonance), the perturbations from Jupiter are largest only there and cause highly erratic orbits over time. The same phenomenon doesn’t happen at other radii, because there Jupiter’s perturbations occur at effectively random points on the asteroid’s orbit and altogether cancel each other out over time.
My code begins with the 3-body system as described on page 115 of Giordano and Nakanishi, using the Euler-Cromer method. First I tested my code to demonstrate that it worked without adding any asteroids:
Then I added asteroids at and around the 2/1 resonance point. Normally, when adding more bodies to such a system, the new bodies interact with and change the orbits of the original objects; but because the mass of the asteroids is negligible compared to all three objects already in the system (the Sun, Earth, and Jupiter), this effect can be ignored. The highly erratic 2/1 resonance is shown in blue:
The other two asteroids demonstrate how little Jupiter perturbs their orbits. I then added asteroids at the 3/1, 7/3, and 5/2 resonance points, but found a very different effect:
I originally thought that this was an error in the simulation, most likely caused by too many objects in the system, but I am now confident it is an accurate approximation of the physics involved; the explanation is in the following section.
Conclusion
Because the precession rate of Mercury has already been found analytically, it hardly comes as a surprise that the computational solution closely matches both theory and observation. The small relative errors compared to the analytical values show the program is successful. While figuring out how to find the perihelion as it rotates was difficult, the rest of the program is straightforward because it follows from the known force equations. The program also demonstrates how known physical values can be altered to make computation easier, such as using a much larger alpha than in reality to speed up the calculation. While it might be more physically realistic to calculate the precession rate using a system with more massive bodies, at least including the largest perturber, Jupiter, the calculated values are good enough to conclude the current simulation is sufficiently realistic.
The program for the Kirkwood gaps demonstrates how one very particular physical property, resonance, can have a very large effect when massive bodies are concerned. The first asteroid graph demonstrates this quite effectively, as the asteroid in the 2/1 gap is perturbed much more than the outermost asteroid, which is significantly closer to Jupiter. The graph of the other resonance points demonstrates exactly why the Kirkwood gaps exist: eventually the perturbations from Jupiter become so large that the orbiting object is ejected entirely. The asteroid at the 2/1 resonance point isn’t ejected because it has the lowest frequency of interaction with Jupiter; at the other resonance points, Jupiter perturbs the orbit enough to eject the asteroid because of the higher rate of encounter. The asteroid at the 3/1 resonance point is affected by Jupiter one more time per orbit than the 2/1 asteroid, a difference which has significant consequences over time. This program serves as a successful example of how a computational simulation can lead to greater understanding of the underlying physics even when an exact answer isn’t possible or even necessary. It is also worth noting that a phenomenon similar to the Kirkwood gaps occurs in the rings of planets; for a real-life example, the rings of Saturn have gaps in them because those gaps are in resonance with one of Saturn’s moons.
The precession of Mercury is caused by the gravitational effects of the other planets (Jupiter contributing the most) and the general-relativistic effects of the Sun. In the first part of my project I set about computing the precession rate of Mercury, using the Euler-Cromer method to simulate Mercury’s orbit and least-squares linear fits of angle vs. time and of precession rate vs. alpha. I achieved a value within 10.39% of the true value.
In the second part of my project, I let the eccentricity of the orbit of a Mercury-sized body vary in order to observe eccentricity’s effect on the precession rate. I determined that the precession rate is inversely proportional to (1 − e²), where e is the eccentricity. The values of precession rate and eccentricity I used are included in a table attached to this post.
This past week I experimented with three-body systems in order to determine the effect of Jupiter on Mercury’s precession. Using the Euler-Cromer method I was able to successfully model a Jupiter-Mercury-Sun system, but only when the Sun’s position and velocity were held at zero.
For the case in which the Sun was given an initial position and velocity, I input parameters that should have kept the center of mass constant; however, the graphs showed the Sun escaping from the solar system. Unfortunately I’m not sure how this error occurred. I am nevertheless convinced that this model is necessary to determine Jupiter’s effect on Mercury’s precession, as I was unable to achieve the correct value with the model in which the Sun’s position and velocity were held constant.
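For reference, one sketch of initial conditions that keep the center of mass fixed is to place the origin at the center of mass and give the Sun the velocity that zeroes the total momentum; the planet vectors rJ, vJ, rE, vE are assumed to be set up elsewhere:

```matlab
% Hedged sketch: zero the total momentum so the center of mass cannot drift.
mS = 1;  mJ = 9.54e-4;  mE = 3.00e-6;    % Sun, Jupiter, Earth masses (solar masses)
rS = -(mJ*rJ + mE*rE) / mS;              % puts the center of mass at the origin
vS = -(mJ*vJ + mE*vE) / mS;              % makes the total momentum zero
```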
Link to Code:
https://docs.google.com/a/vassar.edu/document/d/1DrX3F-mnJsyuq8FPltXRY8aCFyNxQvoUGUG_BoIpiBE/edit?usp=sharing
Link to Eccentricity Data:
https://docs.google.com/a/vassar.edu/spreadsheets/d/1YNlhVNkyhy-arqPQW5dAoME6pCHRKIlXpKaZ6bisG-s/edit?usp=sharing
Our final program takes an acoustic wave signal, strips out all audible frequencies using a Fourier transform, then shifts a range of the ultrasound frequencies into the audible range in order to qualitatively investigate the properties of the audio signal. Our initial goal was to record our own ultrasound-frequency signals for analysis, but we were not able to procure a microphone sensitive enough to record ultrasound. Instead we chose an online video claiming to repel rodents by emitting an audio wave with ultrasound frequencies that affect rats (found here: https://www.youtube.com/watch?v=l5x1Bb9UTT0). On first inspection, the signal seemed to play only what resembled white noise, but we needed an accurate representation of the frequencies present in the audio file before making any further assessments. We hypothesized that the video did not contain ultrasound frequencies capable of repelling rodents.
Investigation:
Using the frequency-analysis methods we implemented earlier in our project, we found that the video actually contained frequencies above 20 kHz. But we were still skeptical, because these higher frequencies were about 80% less powerful than the frequencies present in the audible range. This may have been a tactic to trick the user into believing that high-frequency sounds were playing. We wanted to examine what the high frequencies sounded like, so we took a 5-second clip from the audio file to scrutinize.
(Please find the 5 second clip here: https://drive.google.com/file/d/0B4quZV7NVNf7YU16blYwcUxXYXc/view?usp=sharing)
We started by removing all audible frequencies from the clip, then shifted the ultrasound range into the audible; a sketch of this operation appears after the links below. The resulting sound was again white noise, but at a lower volume.
(No audible frequencies: https://drive.google.com/file/d/0B4quZV7NVNf7clBFTlJIT1I5bzQ/view?usp=sharing , Shifted signal: https://drive.google.com/file/d/0B4quZV7NVNf7amhqVkZ6NjBtUXc/view?usp=sharing)
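A minimal sketch of the strip-and-shift operation described above; the file name, the 20 kHz boundary, and the 18 kHz shift are assumptions:

```matlab
% Hedged sketch: remove audible content, then heterodyne ultrasound down.
[sig, fs] = audioread('clip.wav');       % hypothetical file name
sig = sig(:,1);                          % use one channel
n = numel(sig);
Y = fft(sig);
f = (0:n-1)'*(fs/n);
f(f > fs/2) = f(f > fs/2) - fs;          % signed frequency of each FFT bin
Y(abs(f) < 20e3) = 0;                    % strip all audible frequencies
shiftBins = round(18e3 * n/fs);          % shift the ultrasound down by 18 kHz
Ypos = circshift(Y .* (f >= 0), -shiftBins);   % shift the positive-frequency half
shifted = 2*real(ifft(Ypos));            % audible rendering of the ultrasound band
soundsc(shifted, fs);
```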
A qualitative analysis of this new wave does not lead us to believe this video would repel anything. White noise is not an abrasive sound unless played at extremely high volume, though almost anything is abrasive at a high enough volume. White noise is generally considered the exact opposite: white-noise generators are sold on the premise that they are calming and relaxing, and are used to help people focus or sleep. The video we analyzed is not what we would expect of a rat repellent; it seems more like an aid for rat meditation.
Here is the clip from the original 5 minute video. Notice the amplitude of this signal.
Here are the various operations we performed on the Fourier transform of the clip.
No audible frequencies are present in this new signal. Please note the difference in amplitude in this signal from the original signal.
The new, shifted audio signal.
We then decided to write a code to demonstrate what a real rodent-repellent signal might sound like. We start by assuming that sounds irritating to humans are also irritating to rodents. A sound often considered annoying to us is a pair of high-frequency sinusoidal waves that are similar in frequency but not identical. This is a very dissonant sound, and even at low volumes it can grab the negative attention of someone around you (which can happen if you accidentally sound them in a public space while coding a rat repellent). So we also built a code that generates what we consider an abrasive sound, but well above our audible range so we cannot hear it. We centered it on the hearing frequencies of a distressed baby rat (http://www.ratbehavior.org/rathearing.htm). Unfortunately, Matlab’s audiowrite function was not able to save this as a .mp3 file, so please download the code and use the sound command to play it on your machine.
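A sketch of such a generator follows; the exact frequencies are assumptions chosen inside a rat’s hearing range, well above ours:

```matlab
% Hedged sketch: a dissonant pair of ultrasonic sine waves.
fs = 96e3;                  % sampling rate high enough for ultrasound
t  = 0:1/fs:5;              % five seconds of signal
f1 = 40e3;  f2 = 40.5e3;    % close but unequal ultrasonic frequencies
y  = 0.5*(sin(2*pi*f1*t) + sin(2*pi*f2*t));   % beating, dissonant pair
sound(y, fs);               % play directly, since audiowrite to .mp3 failed
```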
If we were to continue researching methods and applications of frequency shifting, we would write a code that transposes frequencies. Transposition maintains the harmonic ratios of the frequencies present in the wavefunction by multiplying them by a constant instead of shifting them linearly. Changing the sampling frequency at playback also maintains these ratios, but in doing so it distorts the audio signal by stretching or compressing the wavefunction. Our next code would have transposed frequencies without changing the length of the audio sample. We would also like to detect our own ultrasound frequencies with an ultrasound microphone for qualitative analysis.
Our initial plans for this project were to model a guitar string, piano string, and drum head in MatLab, then compare the calculated data to a real scenario, using the physical instruments and recording the sounds they create. We were able to complete the guitar and piano strings, but could not advance to the drum head due to time constraints. However, we did get a lot of good data from our experiments with the guitar and piano strings.
Guitar String
For this part of our project, we modeled a guitar’s B string in MatLab and plucked the string at different points: once at the midpoint of the string, once 1/5 of the string length from the bridge, and once 1/20 of the string length from the bridge. In the following images, you can see the string oscillations, the force of the string on the guitar bridge with respect to time, and the Fourier transform of these force graphs, which reveals the frequencies inside.
Modeled Guitar Data:
Frequencies of waveforms (Fourier Transform):
The force of the string on the bridge with respect to time is nearly identical to the sound wave that the string produces in the air, so we were able to use that data to find the frequencies in our calculated string. We also shifted the frequencies on the graph of the Fourier transform for visual purposes (if not shifted, the curves would line up on top of each other and be difficult to distinguish).
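A rough sketch of this analysis, where Fb, fs, and shift are assumptions standing in for the model’s bridge-force signal, its sampling rate, and the per-pluck visual offset:

```matlab
% Hedged sketch: spectrum of the bridge-force signal, shifted for display.
n    = numel(Fb);                   % Fb = bridge force vs. time (assumed)
Y    = abs(fft(Fb));
f    = (0:n-1)*(fs/n);
half = 1:floor(n/2);
plot(f(half) + shift, Y(half));     % small horizontal shift per pluck so the
hold on;                            % three spectra do not sit on top of each other
```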
Next, we recorded a guitar B string, identical to the one we modeled, and Fourier transformed those recordings to obtain the frequencies inside.
Frequencies of Physical Guitar String:
Here are the recorded guitar sound files. Note that as the pluck gets closer to the bridge, the tone of the guitar becomes harsher. We will discuss this soon.
String Plucked at (1/2) distance:
String Plucked at (1/5) distance:
String plucked at (1/20) distance:
In our code for the modeled guitar string, the calculated waveform is played for the viewer, and just as with the physical guitar, the tone of the sound gets harsher as the pluck gets closer to the bridge. This harsher sound is the stronger presence of overtones in the waveform, as seen in our Fourier transforms. As the pluck gets closer to the bridge, the strength of the higher frequencies increases, which is why the sound produced becomes much harsher. This is what we expected and found in this part of the experiment. Comparing our Fourier-transform graphs, we see that the peaks at the higher frequencies grow with each pluck closer to the bridge. We can also see that the closer the pluck is to the bridge, the weaker the fundamental frequency becomes in the produced sound.
However, there are some inconsistencies between the modeled and physical strings. The modeled string did not contain any decay properties, whereas the physical string did. Also, the modeled string’s motion was calculated in two dimensions, while the physical string moves in all three. This important difference shows in the Fourier transform of the modeled string. When the modeled string is plucked at the halfway point, every other frequency (or peak) is eliminated from the Fourier transform; likewise, when plucked at 1/5, every fifth peak is eliminated. In this modeled state, we are effectively killing these overtones by plucking where their nodes would naturally occur on the string. The decay properties of the physical string allow these “ghosted” frequencies to sound, because the string can move freely rather than stay in a fixed 2-D plane with a fixed wave profile. Another possible reason is that the physical string is not of ideal form: it is a series of coiled wires rather than a solid string, and its imperfections could give rise to these “ghosted” frequencies.
Another small discrepancy in our data is that the fundamental frequency of the physical string plucked at the halfway point is weaker than when plucked at the other distances. We don’t see this in our modeled system because the amplitude of the pluck was held constant for each pluck (giving us the results we expect). On the physical string, the string is able to bend more at the middle than closer to the bridge, so to get an equal-volume pluck we must pluck harder in the middle than near the bridge. If we had kept the amplitude of the plucks on the physical guitar the same, we would expect the power of the individual frequencies to match the modeled ones more closely.
Piano String
For this part of our project, we modeled a piano’s middle C string in MatLab and struck the string with different forces: once softly, and once more powerfully. In the following images, you can see the string oscillations, the force of the string on the piano bridge with respect to time, and the Fourier transform of these force graphs, which reveals the frequencies inside, all for the two different hit intensities.
The model included a hammer with mass and an initial velocity (which changes between the two trials), as well as felt on the hammer, which behaves as a spring. The compression of the felt depends on the positions of the hammer and of the string section where the hammer strikes, and the felt in turn pushes on the freely moving hammer and string according to its compression. When the compression changes sign, we know that the hammer has left the string, and we automatically remove the outside force on the string.
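A minimal sketch of this interaction, assuming a simple linear felt spring of stiffness K; the real code’s constants and felt law may differ:

```matlab
% Hedged sketch of the hammer-felt coupling at the strike point.
z = yHammer - yString(iStrike);     % felt compression (assumed sign convention)
if z > 0
    F = K * z;                      % compressed felt pushes string and hammer apart
else
    F = 0;                          % compression changed sign: hammer has left the string
end
vHammer = vHammer - (F/mHammer)*dt; % hammer decelerates while in contact
% F also enters the string update as a driving term at i = iStrike
```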
Modeled Piano Data and Frequencies
The force of the string on the bridge with respect to time is nearly identical to the sound wave that the string produces in the air, so we were able to use that data to find the frequencies in our calculated string.
Next, we recorded a piano C string, identical to the one we modeled, and Fourier transformed those recordings to obtain the frequencies inside.
Frequencies of Real Piano String
Here are the recorded piano sound files. Note that when the strike is harder, the tone of the piano becomes harsher. We will discuss this soon.
Hard String Hit:
Soft String Hit:
In our code for the modeled piano string, as the strike gets harder, the strength of the higher frequencies increases, which is why the sound produced becomes much harsher. This is what we expected and found with our simulations. A similar trend was also found in the Fourier-transformed recorded sound files.
However, there are some inconsistencies between the modeled and physical strings. One such difference was that in the recordings the fundamental frequency was weaker than the second overtone, as compared to our model; this could be due to the amplifying qualities of the piano box. The modeled string did not contain any decay properties, whereas the physical string did. Also, the modeled string’s motion was calculated in two dimensions, while the physical string moves in all three. The model we simulated made assumptions about the spring-like qualities of the felt, the width of the hammer, and the strike intensity, which we couldn’t make equivalent to the recorded strikes. This created some strange differences, including a strangely sharp hammer hit (not as soft as one might expect from a spring, and differing from the textbook’s predicted hammer behavior). We brainstormed what may be its cause (the hammer velocity, the original felt thickness, the hammer’s strike position) but couldn’t find a solution.
Conclusion
In conclusion, our results were what we expected, within some degree of uncertainty. String plucks closer to the bridge and harder string strikes produce more overtones relative to the fundamental frequency. If we had more time, or if this project were to continue further, we would explore the addition of a damping coefficient in our modeled experiments and compare with the physical strings, investigating the discrepancies mentioned earlier. We would also begin to explore drum-head dynamics, experimenting with two 3-D membranes connected through a closed hoop, similar to the structure of an actual drum.
The following link is to a google drive containing all of our code and sound files.
Please note, if you are planning to run the code that involves the recordings of the instruments, YOU MUST CHANGE THE DIRECTORY IN WHICH THE CODE CALLS THE SOUND FILE. Each place in the code where this should be edited is noted. This needs to be done because the naming of folders and drives varies from computer to computer. The sound files must be saved in the same location as the MatLab code, the correct location of each file must be used, and the correct sound file must be called in each spot, or the sampling will be off and the code will not run properly.
Over the last few weeks, we have been working on a computational project on molecular dynamics. This post below summarizes the details of this project, and the overall goals we achieved.
Initial Goals and Motivations:
We wanted to do an experiment in molecular dynamics because it connects to several things we have studied in our previous physics classes. We used our knowledge of Newton’s second law from classical mechanics; we used the Boltzmann distribution and the concept of temperature from thermodynamics and statistical mechanics; and we used the Lennard-Jones potential, which has its roots in quantum mechanics. All these features made simulating molecular dynamics an attractive, challenging, and very educational project.
We had several goals when we started this project. We wanted to be able to simulate a 2-D gas system and study several of its properties, like temperature, energies, speed distributions, and visualization at equilibrium. We then wanted to build on this system to make it 3-D. We were also considering making it possible to increase the temperature of the system and observe how long it takes to come back to equilibrium; this was meant to simulate periodic heating of gas particles with a laser beam.
Goals We Completed
Due to time constraints, we were unable to fulfill some of our initial goals. However, we were able to finish most of them. The goals we met for this project include:
Simulating the motion of a dilute sample of gaseous Argon atoms in two dimensions.
Calculating the mean velocities of each atom and plotting their speed distribution in each simulation.
Determining the temperature of the sample based on initial and equilibrium speeds of each atom, which required a determination of the kinetic energies of all atoms at each time step.
Comparing the calculated speed distribution with the Maxwell-Boltzmann speed distribution curve.
Completing the simulation and the above calculations in three dimensions.
Introducing the possibility of simulating temperature changes in the system.
Methods
Section 9.1 of Giordano and Nakanishi’s Computational Physics was our primary reference for this project. We used MATLAB for the simulations, dividing our code into several functions (subroutines) instead of using one script. The codes and their purposes are outlined below.
TheInitiator.m
This function initializes an evenly spaced grid in 3-D space (2-D in the old version of the program). This grid serves as the basis for the location of each atom, because we want the atoms to start at roughly the same locations for each simulation. Two further steps are taken to make sure the simulations are not fully deterministic: each atom is displaced slightly from its point on the evenly spaced grid, and it is given a random velocity (within some limits) in each dimension. This function also allows the limits of the initial random velocity to differ, which can produce an effect similar to starting with gas particles at different temperatures.
TheEnforcer.m
This function calculates the force on each atom due to every other atom, based on the Lennard-Jones potential, which is negligible at large distances, attractive at moderate distances, and repulsive at very short distances. This calculation scales quadratically with the number of particles, because each particle interacts with every other particle.
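A sketch of the pair force in reduced Lennard-Jones units (σ = ε = 1), as one might compute it for a single pair of atoms:

```matlab
% Hedged sketch: force on atom i from atom j under V(r) = 4*(r^-12 - r^-6).
dr = ri - rj;                        % displacement vector between the atoms
r2 = sum(dr.^2);                     % squared separation
% F(r) = 24*(2*r^-13 - r^-7) directed along dr/r, i.e. 24*(2*r2^-7 - r2^-4)*dr
fmag = 24 * (2*r2^(-7) - r2^(-4));
Fij  = fmag * dr;                    % repulsive when very close, attractive farther out
```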
TheUpdater.m
This function uses the forces from the previous function and each particle’s positions to calculate each particle’s position and velocity at the next time step. It feeds these positions back into TheEnforcer.m and iterates until the positions and velocities of each particle at each time step are calculated. This function also enforces the periodic boundary conditions (a particle leaving one side of the “box” comes back through the other side, with its velocity unaffected). Particles that are closer to each other through these boundary conditions than they appear must interact as though they were at that closer distance. To clarify, consider a one-dimensional system ranging from x = 0 to x = 10: two particles at x = 1 and x = 9 should interact as if they are 2 units apart, not 8 units apart. Implementing this properly was one of the most difficult tasks we faced.
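The standard way to express this “minimum image” rule is a one-line wrap of each displacement component (Lbox is the box side length):

```matlab
% Hedged sketch: minimum-image displacement for a periodic box of side Lbox.
dr = ri - rj;
dr = dr - Lbox * round(dr / Lbox);   % wraps each component into [-Lbox/2, Lbox/2],
                                     % so x = 1 and x = 9 in a box of 10 are 2 apart
```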
SpeedDist.m
This function takes in each velocity component of each particle at each time step and first calculates a speed for each particle at each time step. It plots three speed-distribution curves by averaging each particle’s speed over each third of the runtime, placing these speeds into bins, and plotting a histogram that is visualized as a curve. The function is configurable to plot any integer number D of speed-distribution curves by averaging each particle’s speed over each 1/D of the runtime instead, but D = 3 effectively showed that the system had reached equilibrium (signified by two successive speed-distribution curves being fairly similar).
This function also plots the kinetic, potential, and total energy, averaged over all particles, at each time step in the simulation. The kinetic energy was needed to calculate temperature: by equipartition, the average kinetic energy per particle equals (d/2) times Boltzmann’s constant times the temperature in d dimensions, so in our 2-D system the product of Boltzmann’s constant and the system’s temperature equals the average kinetic energy.
MainBody.m
This script simply initializes the spatial dimensions of the system and how many atoms will be simulated. This is the code that is directly run by the user to visualize the simulation and look at the speed distributions and energies.
We succeeded in producing visualizations of the Argon atoms in two and three dimensions. Even with 100 atoms, the script takes no more than thirty seconds to run, so more particles could feasibly be simulated. Additionally, our speed distributions matched up relatively well with the actual Maxwell-Boltzmann distributions in terms of horizontal location; this shows that our determination of temperature was accurate, since temperature is what sets the horizontal position of the peak (the most common particle speed). However, our generated speed distributions were narrower than the Maxwell-Boltzmann distribution. It is possible that they could be widened by decreasing the density of the atoms (simulating fewer particles or making the dimensions of the “box” bigger). It is also possible that increasing the maximum initial velocity would widen the speed distribution, since it would give each particle a wider range of initial velocities.
We were mostly satisfied with our energy plots; they showed a fairly constant total energy, which we expect (it is not completely constant because the total energy is calculated from averages of the kinetic and potential energies of all particles). Kinetic and potential energies became steady as the system reached equilibrium, which takes longer at lower gas densities.
One example of a speed distribution plot (click for full-size):
One example of an energy plot (click for full-size):
Conclusion and Future Potential
Based on the Maxwell-Boltzmann distribution plots of our data, we are fairly confident that our simulation is useful for approximating systems of gas molecules. The fact that the total energy of the system remains around a fixed value is also promising. However, the fluctuation in the total energy when the particles start interacting is still not ideal.
One limitation of our program is that the system destabilizes when the initial density of our gas molecules is very high. This is a direct result of the steep increase in the Lennard-Jones potential at small distances between particles.
We want our readers to see this code as a work in progress rather than a complete program. The program still has a lot of unrealized potential. For example, the energy graphs produced by the simulation for cases with different initial velocity limits could potentially be used to calculate the time taken by a system of gases at different temperatures to equilibrate. Similarly, with some changes to the initiator and enforcer functions, the program could also be used to simulate a collection of different gases as opposed to a single type of gas molecule. Although we were unable to simulate it, increasing the density and lowering the initial velocity should also make it possible for future users to simulate materials in solid and liquid phases. Since the program already has a method for changing temperature built into it, some changes could be made to make these temperature changes periodic; this could potentially be used to simulate a laser beam heating the gas particles.
Overall, we are very satisfied with the progress we were able to make throughout the semester. Apart from learning a great deal about different physical concepts, we also developed a deeper appreciation and knowledge of computational approaches towards studying physical systems. We also think that we have developed a fair understanding of the limitations and the potential of the computational tools available to us, and we are confident that this project has prepared us for many more simulations that we are likely to encounter in our careers as physicists.
For our project we decided to model the perturbations of Earth’s orbit caused by Jupiter. We chose to study the effects of Jupiter because it is the only mass nearly comparable to the Sun’s (about 0.1%). We started with a routine that modeled a single planet orbiting a fixed Sun and built up to a true three-body simulation with the Earth, Sun, and Jupiter bound gravitationally.
Our end goal was to write a three-body simulation of the planets, because we planned to observe the effects of an increased Jupiter mass on Earth’s orbit, and at a certain point the “fixed-Sun” model becomes an unphysical approximation. Three-body problems are nearly impossible to solve analytically outside of restricted cases, but quite possible to explore computationally.
Routine I: Planet Orbit
Our first model contained just a single planet orbiting around the Sun. Mathematically, this model relied only on the inverse-square law of gravity to relate the two bodies. Since we know from the physical system that the Sun is far more massive than any of the planets, we know that the planets have a negligible effect on the Sun’s motion. Because of this we simply fixed the Sun at the origin of our system and modeled only the movement of the planet around it.
We chose the Euler-Cromer method as our primary tool because the orbit of a planet is an oscillatory motion, and energy needs to be conserved over each period. This is comparable to the pendulum problems from earlier in the semester, where we ran into the same issue: with the Euler method, energy is added each period. We demonstrated here that the Euler-Cromer method conserves energy with each orbit.
[Figures: orbits computed with the Euler-Cromer method vs. the Euler method.]
Thus, the only thing actually affecting each orbit is the initial conditions. We are essentially approximating all of the orbits except Mercury’s as circular, because their eccentricities are very close to zero. Due to this nearly circular motion, the gravitational force equals the centripetal force of a body in uniform circular motion, and we are able to calculate the initial velocity for a planet starting at a distance of its semi-major axis from the Sun. We will observe circular motion unless the initial velocity is wrong:
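Setting the gravitational force equal to the centripetal force, $G M_S m / r^2 = m v^2 / r$, gives

$$v = \sqrt{\frac{G M_S}{r}}$$

In astronomical units (AU, years, solar masses), where $G M_S = 4\pi^2$, a planet starting at r = 1 AU therefore gets an initial speed of 2π AU/yr.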
Routine II: Jupiter-Earth
Having already completed a foundational program for a gravitationally bound single-planet simulation, it was not much of a reach to add a second planet for our second routine. The exact same physical equations were used, and we continued using the Euler-Cromer approximation. Initially, once Earth and Jupiter are coupled gravitationally, we observe no noticeable difference in the Earth’s orbit, which matches physical observations. This program is very useful, however, because it allows us to adjust Jupiter’s mass in order to observe how the Earth’s orbit is perturbed, without affecting the Sun’s motion. While this was very useful as an initial exploration into the perturbations of Earth’s orbit caused by Jupiter, it was not complete, because it still didn’t take into account the motion of the Sun when Jupiter’s mass was very large. That was the goal of our next routine.
Routine III: Three-Body
The difference between the three-body routine and the Earth-Jupiter routine is that we no longer fix the Sun in place; instead we calculate the effects of Jupiter and Earth on its motion. We also made several changes to the initial conditions to make the model work, such as moving the origin to the center of mass of the system and giving the Sun an initial velocity such that the center of mass does not drift. We also set the Sun’s initial velocity to change with increases in the mass of Jupiter, in order to maintain these conditions.
Observations:
These simulations demonstrated that the Earth’s orbit is very stable even up to many times the current mass of Jupiter (as would be expected from observational evidence).
Upon observing the effects of changes in Jupiter’s mass on this system, we were able to conclude that the final three-body system becomes chaotic at sufficiently large Jupiter masses. Small changes in Jupiter’s mass caused entirely distinct orbits to form. In many of these systems, the motion appeared completely random, sometimes resulting in the Earth’s ejection from the system. This is just like the driven pendulum: below a certain driving force the pendulum’s motion was still predictable, and above a certain force the motion became chaotic.
This threshold for chaotic motion was clearly demonstrated when the mass of Jupiter approached the Sun’s mass (roughly 1,000 times Jupiter’s actual mass). The system starts to look very much like a binary star system orbiting a center of mass. In this scenario Earth’s orbit is VERY unstable and chaotic, completely unpredictable, and extremely sensitive to initial conditions. When Jupiter’s mass is sufficiently small that Earth never directly interacts with it (i.e., never crosses or comes near the path of the Sun’s motion), the motion of Earth is not chaotic.
To download our code, please click the following image:
Each musical instrument has its own characteristic sound due to the contributions of different harmonic intervals above the fundamental frequency. The relative strengths of these harmonics are due to the instrument’s mechanism of sound production and transmission. For my project, I modeled the sound generation of two musical instruments, the harpsichord and the clavichord. Building upon content in the class text (Giordano and Nakanishi, specifically chapters 6 and 11), I simulated the plucking action of the harpsichord for different values of several input parameters, compared computational results to data from a real instrument, and designed and executed a computational model for the clavichord.
Mathematical Background
For both instruments, the initial wavefronts evolved in time via
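$$y_i^{\,n+1} = 2\left(1 - r^2\right)y_i^{\,n} - y_i^{\,n-1} + r^2\left(y_{i+1}^{\,n} + y_{i-1}^{\,n}\right)$$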
where i is the spatial index of the wire’s height y, n is the temporal index, $r \equiv c\,\Delta t/\Delta x$, Δx is the spatial step size, and Δt is the time step size. The parameter c is the speed of the wave on the string and depends on several parameters of the string via
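$$c = \sqrt{\frac{T}{\rho\,\pi R^2}}$$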
where T is the tension on the string, R is the radius of the string, and ρ is the density of the string material. The final key equation needed is the one for the tension:
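$$T = \rho\,\pi R^2\left(2 L f_1\right)^2$$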
where L is the length of the string and $f_1$ is the fundamental frequency of the desired note. It should also be noted how to calculate the force of the string on the bridge (where sound is transmitted to the soundboard and then the air), using eq. 11.4:
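$$F_{\text{bridge}} = T\left.\frac{\partial y}{\partial x}\right|_{\text{bridge}} \approx T\,\frac{y_2^{\,n} - y_1^{\,n}}{\Delta x}$$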
Harpsichord
The computational model for the harpsichord was identical to that of a guitar; both are plucked-string instruments that function in essentially the same way. Therefore, the initial wavefront was triangular. The plucking ratio β, a parameter used to set up the wavefront, proved to have the most effect on the frequencies generated (see figure 1). This can be explained by the formation of an antinode at the plucking point: if the string is plucked near its very edge, high frequencies will be present in the signal, as these frequencies result from standing waves on the string with much smaller wavelengths than that of the fundamental. Because c is constant, these waves make high-frequency contributions to the overall sound (recall that $f = c/\lambda$). Interestingly, the formation of an antinode at the plucking point prohibits frequencies that require a node at that point; for instance, a string plucked in the middle (β = 0.5) will only contain the fundamental frequency, the third harmonic, the fifth harmonic, etc., as seen in figure 1.
Figure 1 (click to expand): Variations in acoustic spectrum due to different plucking ratios. The strings here had a fundamental frequency of 261.63 Hz (middle c). Note that β > 0.5 simply means that the plucking point was not on the same half of the string as the bridge (sound transmission point). For instance, a plucking ratio of 0.75 is the same as one of 0.25 due to the symmetry of the string and has no bearing on the acoustic spectrum.
Changing other parameters (R, L, and plucking amplitude) had no effect on the relative strengths of the frequencies in the spectrum, assuming that all other independent variables were held constant. This may indicate a weakness in the model, as builders avoid deviating from tried-and-true ratios to avoid producing unpleasant sounds (as well as to avoid combinations of R and L that would require tensions that break strings).
The importance of β in producing a particular acoustic spectrum is evident in instrument construction. Harpsichords often have two plucking distances per note from which the player can choose, depending on the desired tonal effect.
After I developed a computational model, I compared results to recordings of individual notes from a real harpsichord. The .wav files used were part of a freely available sound bank of an instrument, designed to be used without intent for profit in musical instrument emulation software (Bricet & Garet 2007). Thus, the quality of the samples is high. Figure 2 compares the sound of the real instrument with that of the model; though the latter is clearly digitally generated, the sound produced is at least qualitatively similar to the physical system it models.
Figure 2: Qualitative comparison of actual data vs. model-generated data, link here (Youtube videos do not come through for all users).
Quantitative analysis of the accuracy of the model vs. the physical data was not performed; this would have been essentially impossible without accurately knowing the plucking ratios built into the instrument that was recorded (there are no standards for β, which vary from one instrument to another and which are different note by note within the same instrument). However, I did plot generated acoustic spectra along side physical spectra in order to visually compare the model to the system (see figures 3 through 6). Though the relative strengths of the harmonic frequencies often mismatched, the model was very accurate in predicting which frequencies would be represented in a signal.
Figures 3 (at left) and 4 (at right): Model (blue) and physical (red) acoustic spectra for the note one octave below middle C (f₁ = 130.81 Hz). Figure 4 is simply a restriction of the data plotted in figure 3 to lower frequencies. Some harmonics' relative strengths matched well (e.g. the second and third), while others were badly mismatched (e.g. the first and fourth).
Figures 5 and 6: Similar to figures 3 and 4, but for the note two octaves above middle C (f₁ = 1046.5 Hz).
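For concreteness, here is one way such a spectrum can be computed from the simulated bridge-force signal; `bridge` comes from the sketch above, and the sampling interval here is an assumed value, not the one used in the project.

```python
import numpy as np

signal = np.asarray(bridge)                 # bridge-force time series from the sketch above
dt = 1.0 / 44100.0                          # assumed sampling interval (s)

spectrum = np.abs(np.fft.rfft(signal))      # magnitude of the Fourier transform
freqs = np.fft.rfftfreq(len(signal), dt)    # matching frequency axis in Hz

# Peaks in `spectrum` sit at the fundamental and its harmonics; with beta = 0.5,
# the even harmonics should be (nearly) absent, as in figure 1.
```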
Clavichord
The mechanism of the clavichord is unique among musical instruments. Instead of being hammered, plucked, or bowed, the string is struck at one end by a blunt metal point (via a piano-like key). This point, called a “tangent,” serves as one end of the sounding portion of the string; in effect, the player directly controls the boundary conditions of the system and can even slightly adjust the tension at will. Because there is no plucking point, a new computational model had to be implemented.
The difference between the clavichord and the harpsichord required only a few changes in the code. Instead of beginning as a triangle with a peak at the plucking point, the string began straight but with a negative slope. Upon running the program, one end of the string was raised over a length of time τ; once this time had passed, the moving end of the string was held stationary for the remainder of the simulation (a code sketch of this boundary condition follows figure 7). The time evolution of the string, the force on the bridge, and the Fourier transform (acoustic spectrum) of a note on the clavichord can be seen in figure 7.
Figure 7: Time evolution of clavichord string, bridge force, and acoustic spectrum.
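Here is a sketch of the modified boundary condition, under the same assumptions as the harpsichord sketch above; the initial slope, final tangent position, and rise time are all illustrative guesses rather than the values used in the model.

```python
import numpy as np

N = 100                   # spatial grid points (assumed)
h = 0.005                 # initial tangent-end displacement in meters (assumed)
tau_steps = 200           # time steps over which the tangent end rises (assumed)

x = np.linspace(0.0, 1.0, N)
y = -h * x                # string starts straight, with a negative slope
y_old = y.copy()

bridge = []
for step in range(10000):
    y_new = np.zeros(N)
    y_new[1:-1] = 2*y[1:-1] - y_old[1:-1] + (y[2:] + y[:-2] - 2*y[1:-1])
    # Tangent end: raised linearly until step = tau_steps, then held stationary.
    y_new[-1] = -h * max(1.0 - step / tau_steps, 0.0)
    y_old, y = y, y_new
    bridge.append(y[1] - y[0])
```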
Though acoustic spectra could be produced as they were for the harpsichord, they would not be illuminating here, as I have no audio recordings of individual clavichord notes with which to compare. However, given the unrealistic sound the model generates (not provided here), I suspect the model, though likely a good start, is inadequate for several reasons. First, the model string starts stationary at the bridge end (see figure 7) and requires many time steps to “even out,” so low frequencies that are not part of the actual sound may be over-represented (note the initially strong contribution at f = 0 Hz). Second, all real clavichords have two strings per note, which may interact in subtle but significant ways that this simple model does not address. Finally, the tension in a real clavichord string changes slightly as the key is depressed, which is also not accounted for in this model.
Conclusions
Though there are issues with the models, especially the one I designed for the clavichord, all were successfully implemented; with further refinement and more accurate input parameters (unobtainable without access to the instrument that was recorded), these models could be applied to instrument design. For instance, one could determine the optimal dimensions for achieving a desired tonal quality.
In summary, I tested the validity of a given model against actual data, varied input parameters to determine their effects on the model's output, and derived a new model to describe a system; at this early stage, the new model seems reasonable, if somewhat simplistic. Beyond describing Baroque keyboard instruments, which I personally found to be an interesting topic, this project was a general exercise in research on a small scale: I started with a model (or derived my own), implemented it computationally, generated data, and compared the results to experimental data.
1. Questions from our last post:
In this section, we answer the questions we received on our last post.
It seems that not every neuron is connected to EVERY other neuron, since there are different connection patterns.
When you say an energy of “−5000”, what are your units/reference point? I am still wondering how and why the Monte Carlo method works, and how the energy is so low for ordered systems. This may be unrelated and somewhat random, but why is it that entropy (disorder) in chemistry always increases and is actually considered a lower state of energy?
Every neuron in our neural network is connected to every other neuron. The strengths of these connections are stored in the J matrix, and each pattern has its own J matrix. When we store multiple patterns in one system, a separate J matrix is created for each pattern, but the matrix actually used (J_total) is the element-wise average of the separate J matrices. So for each neural network there is, in the end, only one J matrix, which describes the connection between each neuron and every other neuron.
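Here is a sketch of that construction, assuming the usual Hebbian rule for this kind of network (as in Giordano & Nakanishi); the function and variable names are illustrative, not those of our actual code.

```python
import numpy as np

def j_matrix(pattern):
    """J matrix for one stored pattern (a flat array of +1/-1 neuron values):
    J_ij = x_i * x_j, with no self-connections (J_ii = 0)."""
    J = np.outer(pattern, pattern).astype(float)
    np.fill_diagonal(J, 0.0)
    return J

def j_total(patterns):
    """The J matrix actually used: the element-wise average over all patterns."""
    return sum(j_matrix(p) for p in patterns) / len(patterns)
```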
When a pattern is greatly distorted, it takes more energy to return it to the desired pattern. As for entropy: our neural network is an artificial system, and its “energy” has no relation to thermodynamic entropy. The energy of ordered (stored) patterns is lower than that of disordered patterns simply because that is how the code is designed: the J matrix is constructed so that calculating the energy of a stored pattern gives a large negative value, while the energy of a disordered pattern comes out close to zero. The energy calculated in our neural network has no units; much as with relative intensity, we are concerned only with the energies of states relative to one another. The Monte Carlo method simply steps through a distorted pattern and determines whether each neuron needs to be flipped; the decision for a given neuron is based on the summed inputs it receives from all of the other neurons in the network (a sketch of this rule follows).
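In code, the energy and the flipping rule look roughly like this (again a hedged sketch with my own conventions, not our exact implementation):

```python
import numpy as np

def energy(J, s):
    """Relative energy of state s: large and negative for a stored pattern,
    close to zero for a disordered one. Unitless by construction."""
    return -0.5 * s @ J @ s

def monte_carlo_sweep(J, s):
    """One Monte Carlo sweep: each neuron is aligned with the summed input it
    receives from all the other neurons (J_ii = 0, so no self-input)."""
    for i in np.random.permutation(len(s)):
        net_input = J[i] @ s
        if net_input != 0:
            s[i] = 1 if net_input > 0 else -1   # flip if it disagrees with its input
    return s
```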
2. Analysis of our Neural Network:
Since our last post, we have created neural networks with larger numbers of stored patterns, in an attempt to study the system as it begins to fail at correct memory recall. We accomplished this by building systems with more letters as stored patterns: one which stored A–C, one which stored A–D, and one which stored A–F plus a lowercase a.
Below are the 7 stored patterns within our neural network.
These systems (and links to their codes) are discussed below, but first a little background on the storage of many patterns in neural networks.
As explained in our previous posts, storing more patterns in a neural network makes those patterns more unstable: in terms of the energy landscape picture from our last post, the minima associated with each pattern become shallower as more patterns are stored. This occurs because of the averaging of the J matrices corresponding to the individual patterns we want to store: each new pattern distorts parts of the others. This can be seen visually in the pictures of the J matrices in our last post; the combination of A and B is much more complicated than A or B on their own.
Our textbook (Giordano and Nakanishi) discusses the limits on how many patterns can be stored in a neural network. The main obstacles are that (1) any patterns that are too close to each other will likely interfere, and (2) there is a theoretical limit at which the system changes and all patterns become unstable.
For (1), think of the energy landscape again, as well as the letters we use. The minima for the letters B and D lie relatively close together on the energy landscape because the patterns themselves are relatively similar, so their troughs tend to merge somewhat and may produce outputs somewhere between the two. We ran into exactly this problem with our A–D code, which often works for A and C (as long as they are not too distorted, usually by less than 0.3 or so), but which usually returns a pattern somewhere between B and D when given distorted inputs of B or D.
If you want to try this out for yourself, use the code below.
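(Our full code was linked here; the following is only a rough stand-in built from the sketches above, using random placeholder patterns instead of the actual letter bitmaps.)

```python
import numpy as np

rng = np.random.default_rng(0)
patterns = [rng.choice([-1, 1], size=100) for _ in range(4)]   # placeholders for A-D
J = j_total(patterns)                                          # from the sketch above

s = patterns[1].copy()                           # start from the second pattern
flips = rng.choice(100, size=30, replace=False)  # distort 30% of the neurons
s[flips] *= -1

for sweep in range(1000):                        # same 1000-sweep cap as our code
    previous = s.copy()
    s = monte_carlo_sweep(J, s)
    if np.array_equal(s, previous):              # converged to a fixed point
        break

print("recalled correctly:", np.array_equal(s, patterns[1]))
print("final energy:", energy(J, s))
```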
The theoretical limit on the number of patterns that can be stored is given in the text as ~0.13N (in our case, 13 patterns). Our neural networks begin to function very poorly once we store 7 patterns (A, B, C, D, E, F, a); beyond simply confusing the letters that are similar, nearly all inputs lead to the same answer, a strange hybrid of letters (mainly F and B, it seems), shown below.
This code does work for some inputs (given an A distorted by 0.1 or less, successful recall of the letter A is usually achieved). However, nearly all inputs, even those unlike any other pattern (such as the lowercase a), give the same jumbled result seen above. This is likely a combination of the two effects mentioned above: many of these patterns are similar to one another, and the sheer number of patterns has significantly reduced the depth of the minima associated with each one, leading to more instability across all of the stored patterns. Ideas for getting real systems closer to the theoretical limit of 0.13N are discussed in Future Plans.
We were able to create a very functional neural network storing three patterns, A–C, which avoided almost all of these problems: no two patterns were too similar, and there were not so many patterns that the energy-landscape minima became too shallow. The link to this code is below.
3. Conclusions:
We started this project wanting to answer these questions:
How far away initial inputs can be from stored patterns while maintaining successful recall.
How many patterns can be stored in the neural network. The book discusses the maximums associated with this type of neural network, but we will investigate why this limit exists, as well as what kinds of behaviors change around this limit.
How long (how many Monte Carlo iterations) recall of stored patterns takes.
During the time spent on this project we were able to answer the above questions, though we also ran into several unexpected problems and results. We found that a pattern could be distorted by flipping at most roughly 45% of its neurons for our code to still work; patterns distorted by 50% no longer recalled correctly, and the output image was not recognizable. These numbers are based on a neural network with just three stored patterns: A, B, and C.
Several patterns can be stored in the neural network; however, in order to have a network that works reliably, we could only store 3 of our 7 patterns. This is because, beyond C, the letters become very similar to one another (for instance B and D, or E and F). With these similarities, the neural network produces output patterns halfway between the similar letters instead of a single letter. If we had 7 patterns that were all drastically different from one another, we believe our neural network would work.
The number of Monte Carlo iterations required depends strongly on the number of patterns stored and on how distorted the input pattern is. In our code we set a limit of 1000 iterations: the program stops if it has taken 1000 Monte Carlo sweeps without reaching the desired pattern. Hitting that limit means the desired pattern is not going to be produced; this is where you get outputs that are incomplete or halfway between two letters. When our neural network was successful, it took only 1 Monte Carlo iteration to give the desired pattern. Below is a picture of a distorted B and the output after 1000 iterations, a pattern between B and D. As you can see, the distorted B is very close to the letter B, but because this network also has D stored as a pattern, it cannot make up its mind as to which letter to display.
4. Future Plans:
One of the main things we would want to do next is to create a neural network with more patterns, approaching the theoretical limit of 0.13N. The best way to do this is likely with patterns that are more orthogonal than the letters used in this project; this would be easiest with very simple but drastically different patterns, such as vertical lines or circles in different positions (a quick way to screen such candidates is sketched below). With these new patterns, we would be able to uncover much more about our neural network than we can by storing letter patterns that are very similar to one another.
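A quick way to screen candidate patterns (an illustration of the idea, not part of our current code) is to measure their normalized overlap: patterns suitable for simultaneous storage should overlap near zero, while similar letters like B and D overlap strongly.

```python
import numpy as np

def overlap(a, b):
    """Normalized overlap between two +1/-1 patterns: 1 if identical,
    -1 if inverted, near 0 if (nearly) orthogonal."""
    return (a @ b) / len(a)

rng = np.random.default_rng(0)
p1 = rng.choice([-1, 1], size=100)
p2 = rng.choice([-1, 1], size=100)
print(overlap(p1, p2))    # near 0: good candidates for storage together
```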
Another objective we would want to tackle is how the size of the neural network affects its function. Specifically: if we used the same 7 patterns (letters A–F and lowercase a) in a network of 15×15 neurons (or even bigger), would we get successful recall of the lowercase a, which our current 10×10 network could not manage? More neurons (and, more importantly, a bigger J matrix) should be able to handle more patterns before the energy-landscape minima become too shallow, so in theory this should work. Testing it would give us more insight into the limits on the number of patterns that can be stored in a neural network.