Weekly Status Report #6: 10/20 – 10/27

Celine:

This week I worked on preparing material for our mid-semester demo and was able to test OCR performance on warped vs. unwarped pages. To show why the dewarping step is necessary, I have included below a comparison of the text extracted from a slightly curved page vs. the text extracted from a relatively flat page.

The left column is a curved page, and the right is a relatively flat page. The source pages look like so:

Thus I believe it is worthwhile to pursue dewarping as a strategy for improving OCR. I was able to test dewarping the left image using a program I found online, and it produced text like so:

While it still isn’t perfect, this dewarping definitely improved the performance. I did some research and found that Python has some autocorrect packages we can use to check the outputs that Tesseract produces. I’ve also concluded that the structure of our device needs to include some lighting, as this greatly enhances the program’s ability to threshold and process the pages.
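
As a first sketch of what that autocorrect pass might look like, here is a minimal example assuming the pyspellchecker package (the package choice, the file name, and the simple punctuation handling are my own placeholder assumptions, not a final decision): it runs Tesseract on an image and swaps any word the spell checker doesn't recognize for its best suggestion.

```python
import pytesseract
from PIL import Image
from spellchecker import SpellChecker  # pip install pyspellchecker

def ocr_with_autocorrect(image_path):
    """Run Tesseract on an image, then swap words the spell checker
    doesn't know for its best single-word suggestion."""
    raw_text = pytesseract.image_to_string(Image.open(image_path))

    spell = SpellChecker()
    corrected = []
    for word in raw_text.split():
        # Strip simple punctuation before looking the word up.
        token = word.strip(".,;:!?\"'").lower()
        if token and spell.unknown([token]):
            suggestion = spell.correction(token)
            # correction() can return None when it has no candidate.
            corrected.append(suggestion if suggestion else word)
        else:
            corrected.append(word)
    return " ".join(corrected)

print(ocr_with_autocorrect("curved_page.jpg"))  # placeholder file name
```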

This coming week I will complete my own dewarping program and start looking at implementing autocorrect!

Effie:

This was an exciting week! Following up on last week's work setting up the Pi (formatting the SD card, installing Raspbian, and registering the Pi on CMU wifi), getting some books from the library to try out, and talking through the design with Indu for the stand and wheel-arms she'll be building (and going on adventures to find scrap wood!), this week I had fun getting things moving! As more parts came in, I soldered the motor-hat pins and connected the hat to the Pi to drive a stepper motor (for the conveyor belt) and the "teensy" DC motor (for the wheel). I also found some drivers online and tweaked them to operate a servo. I am now able to drive and control the two motors, the servo, and the camera all independently. It is possible we might need to buy a separate hat for the servos (I'm not sure they can run on the same pins the motors use)… I hope to know by next week.

Last week we had connected all the camera extension parts together only to find that our 8MP camera wasn't working because the connector soldering was messed up, but thankfully Sam was able to fix our camera, so now we are able to get great pictures! I also met this week with Greg Armstrong in the Robotics Institute, who gave me valuable advice on how we might operate our wheel-arms. Finally, I am working on a script to automatically take and save pictures at predetermined time intervals (to then send off to Celine's code for processing into text); a rough sketch is below.
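
Here is a rough sketch of what that capture script might look like, assuming the standard picamera library and the 8MP camera module; the interval, resolution, and output directory are placeholders until we settle on the real pipeline.

```python
import time
from datetime import datetime
from picamera import PiCamera

CAPTURE_INTERVAL_S = 10          # placeholder: seconds between shots
OUTPUT_DIR = "/home/pi/pages"    # placeholder: where Celine's code will look

camera = PiCamera(resolution=(3280, 2464))  # full-resolution stills on the 8MP module
camera.start_preview()
time.sleep(2)  # give the sensor a moment to settle exposure/white balance

try:
    while True:
        name = datetime.now().strftime("page_%Y%m%d_%H%M%S.jpg")
        camera.capture(f"{OUTPUT_DIR}/{name}")
        time.sleep(CAPTURE_INTERVAL_S)
finally:
    camera.stop_preview()
    camera.close()
```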

So next week I want to work on a few things: integrating my code so it can speak to several of the components at once, working with Indu on physically connecting components together for a prototype of the arm she is designing, and attempting to figure out how to motorize the camera we bought so it can programmatically zoom/pan – though I fear that won't be possible (and might not be necessary anyway), it would be cool if I could!


Indu:

This week I primarily worked on building out the page-turning device. I spent a few hours with a mechanical engineering student, Krishna Dave, and talked through what I had drawn up for the design of the entire device, to make sure I was thinking about everything properly and had it mapped out well. From there I started constructing the turning part with the gear. I went to the Maker Space and talked with an employee for a while about the best way to mount the motor hub into the gear (the hub's diameter is larger than the gear's diameter). Since I do not have machine shop training, I left those parts with the employee so he can do the machining when possible.

I was originally planning on building the stand as well, but my mechanical engineering friend suggested I use a tripod to mount the camera instead of building another part of the system, at least for now, since a tripod is height-adjustable. Following this advice, I went to Hunt Library, looked through their assortment of tripods, and found one that may work for the design. I have yet to test it with the camera, as I need to mount the camera on wood before I can mount it to the tripod; I will do that this weekend.

I plan on continuing to build the device. There is still more thinking that needs to go into the wheel portion of the page-separation mechanism; after talking with my mechanical engineering friend, it seems we should focus less on trying to have the wheel work by gravity.

Weekly Status Report #5: 10/13 – 10/20

Celine:

This week, my groupmates and I wrote up our design paper together. I was sick for most of this week, so I have not been able to make significant progress. The progress I did make was testing text detection using a pre-trained EAST text detector convolutional neural network, to try to segment text from page sets that contain images. The results were not what I wanted, though, so I will need to try something else for this problem. For now, the data I have tried has worked well with pytesseract, as it ignores the illustration on the page (please see the second figure below).

I am in the middle of implementing some text/page skew correction to improve the output from pytesseract. When I input a clean, unskewed, high-resolution image, pytesseract works very well:

However, when used on an actual image that I took with my phone, I get results like this (see linked image for better resolution):

I noticed that “Stuart” is recognized incorrectly when it is along the curved part of the page, but recognized correctly when it is on the flatter portion of the page, so I am hoping that some skew correction will improve recognition.

I hope to have this skew correction completed this weekend, and in the coming week I will implement some image processing such as binarization to see if that improves accuracy as well. By the end of the coming week I hope to have a working Python script that takes an image of a page set against a black background and performs with better accuracy than in the image shown above.
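
As a rough sketch of the kind of preprocessing I'm aiming for (not the final script), here is a deskew-then-binarize pass using OpenCV's minAreaRect angle estimate and Otsu thresholding. It assumes the page is already isolated against a dark background, and the file name is a placeholder.

```python
import cv2
import numpy as np
import pytesseract

def deskew_and_binarize(path):
    """Estimate the page's skew angle, rotate to correct it, then
    binarize with Otsu's threshold before handing it to Tesseract."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

    # Invert so text/page pixels count as foreground (dark background assumed).
    fg = cv2.threshold(gray, 0, 255,
                       cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

    # minAreaRect over the foreground pixels gives the dominant rotation.
    # (The angle convention changed between OpenCV versions, so this
    # mapping may need adjusting.)
    coords = np.column_stack(np.where(fg > 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    angle = -(90 + angle) if angle < -45 else -angle

    # Rotate about the image center to remove the skew.
    h, w = gray.shape
    M = cv2.getRotationMatrix2D((w // 2, h // 2), angle, 1.0)
    rotated = cv2.warpAffine(gray, M, (w, h),
                             flags=cv2.INTER_CUBIC,
                             borderMode=cv2.BORDER_REPLICATE)

    # Binarize the deskewed page and run OCR on the result.
    binary = cv2.threshold(rotated, 0, 255,
                           cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    return pytesseract.image_to_string(binary)

print(deskew_and_binarize("page_photo.jpg"))  # placeholder file name
```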

Effie:


Indu:

This week I worked more on the design of the page-turning device, specifically how the wheel and gear should connect to the motor so that everything works together to turn the page. I made drawings of a potential design that includes a pivoting mechanism, so the assembly can be lifted up while the page is being turned.

This Wednesday our Raspberry Pi came in, so Effie and I spent the majority of class time setting it up and further discussing how we think the Pi will be used to operate all the different components of the device, since integrating the mechanics of the device with the Pi will involve both of us. Effie also went to the library and got us our test base for the books. We all spent the rest of the week trying to use the Arducam with the Pi to take pictures of the books, but kept getting errors that the camera was not detected, so we think the Arducam may be faulty. Celine contacted Arducam to ask about the issue we were having, so hopefully we get a helpful response soon.

In terms of next steps, next week I will work on building a mock version of the stand and the page-turning device so that Effie and I can connect various parts of our device (e.g. the wheel for page-separating, the gear for page-turning) to the Pi in an attempt to make each part work individually. We would also like to test other page-turning methods; as we stated earlier, while we think the conveyor chain method is the gentlest, we want to test this to know for sure.

Weekly Status Report #4: 10/6 – 10/13

Celine:

At the beginning of this week I mostly worked on the design presentation slides with Effie and Indu. For the computer vision work, I made progress getting Tesseract installed and working on some unprocessed images I pulled off the internet. My teammates and I have also been working on the design paper, which we will finish tomorrow.

Next week I plan to work on segmenting a page/set of pages from the background and look into ways to improve Tesseract's accuracy.

Effie:

This week I worked on my slides for the in-class presentation and several parts of the design paper. I met with a friend who had done a similar page-turning project to get advice on how to build a page-lifting mechanism with a servo arm and motor wheel, which was very helpful.

This coming week we plan to finish the design paper, and hopefully the Pi will come in soon so I can get going on setting it up and connecting devices.

Indu:

This week I worked on the Design Presentation with Celine and Effie. It involved thinking more about what our final device should look like. I also spent a significant amount of time working on a drawing of the whole device. As we only have one of our conveyor chain gears, I spoke with a Mechanical Engineering friend and we may laser cut another gear this weekend. I also worked significantly on our Design Report and helped Celine further flesh out the block diagram so that more of the technical specifications are included.

Weekly Status Report #3: 9/29 – 10/6

Celine:

This week I finished setting up OpenCV on my laptop and was able to run OpenCV from both Python and C++ programs. In addition to the setup, I researched some methods we can use for image processing on document images: document alignment using edge detection, which can be done with imutils and OpenCV, and image binarization by reading the image as grayscale and then applying a threshold. There is also the problem of segmenting apart the two pages in the image; I am thinking of summing the image columns into a projection histogram to find the split. For the gear replication, we were not able to work on it this week, but we have decided to laser cut the gear instead of 3D printing it, as that will be faster and produce a smoother gear. Finally, I wrote up a draft spreadsheet detailing what needs to be done in each component of the project.
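
To make the page-splitting idea concrete, here is a rough sketch of the binarization-plus-projection-histogram approach: Otsu thresholding, summing the ink in each column, and splitting at the low-ink valley. The file names, and the assumption that the gutter falls in the middle third of the image, are placeholders rather than tested code.

```python
import cv2
import numpy as np

def split_pages(path):
    """Binarize a two-page spread and split it at the low-ink column
    (the gutter) found with a vertical projection histogram."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

    # Otsu threshold, inverted so text pixels are 1 and background is 0.
    binary = cv2.threshold(gray, 0, 255,
                           cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1] // 255

    # Sum the text pixels in each column; the gutter between the two
    # pages should be the emptiest column near the middle of the image.
    column_ink = binary.sum(axis=0)
    w = gray.shape[1]
    middle = column_ink[w // 3: 2 * w // 3]
    split = w // 3 + int(np.argmin(middle))

    return gray[:, :split], gray[:, split:]

left_page, right_page = split_pages("spread.jpg")  # placeholder file name
cv2.imwrite("left.png", left_page)
cv2.imwrite("right.png", right_page)
```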

This coming week, I will be working on the design presentation with Indu and Effie. I will also implement the above-mentioned image processing methods on some experimental data, and I will get the gear designed and cut with Indu.

Effie:

This week I (hopefully) finalized our parts list and ordered the motors, servo, Pi + SD card, Pi hat to connect to the motors and extensions, and wheels, and I mapped out how we will connect all the pins. While we await the parts' arrival next week, we will work on the design proposal, research how to operate the motors with their respective drivers, and code up some PWM logic, so we'll be ready to hit the ground running when they come in.
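
As a head start on that PWM logic, here is a minimal sketch using RPi.GPIO to ramp a DC motor's speed through a driver's enable pin; the GPIO pin number, frequency, and ramp timing are placeholder assumptions until the parts and wiring are actually in hand.

```python
import time
import RPi.GPIO as GPIO

ENABLE_PIN = 18  # placeholder: GPIO wired to the motor driver's enable/speed input

GPIO.setmode(GPIO.BCM)
GPIO.setup(ENABLE_PIN, GPIO.OUT)

# 1 kHz PWM; the duty cycle (0-100) sets the effective motor speed.
pwm = GPIO.PWM(ENABLE_PIN, 1000)
pwm.start(0)

try:
    # Ramp the motor up and back down as a quick wiring/driver sanity check.
    for duty in list(range(0, 101, 10)) + list(range(100, -1, -10)):
        pwm.ChangeDutyCycle(duty)
        time.sleep(0.5)
finally:
    pwm.stop()
    GPIO.cleanup()
```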

Indu:

This week we got our initial parts, mainly the Arducam, gear, and conveyor chain. After closely analyzing all the parts we received on Monday with Celine, we worked on further fleshing out what each part of our block diagram will consist of and involve.

This week I worked with Effie on further planning out how all the other parts will work together, such as the Raspberry Pi and the motors for our wheel and conveyor chain. We also decided to buy a servo motor as a backup method of turning the page in case the conveyor chain method proves not to work the way we thought. This involved working with Effie to make sure each part we ordered can be integrated with all the others. It also involved working on a rough sketch of what the whole design should look like to test the page-turning part of the project. I want to make sure that the dimensions of the stand and conveyor chain will be practical for a binder, as that is what we want to test with first.

This coming week I will be working with my team on the design presentation. I will also be working with Celine on designing and cutting the other gear for the conveyor chain part of the design. If we receive our parts this coming week, then hopefully I will be able to build a prototype of the page-turning device.