Sign Language Recognition (SLR) aims to facilitate communication between deaf-mute people and hearing people. The task is challenging because hand postures are complex and vary widely. Some methods require the user to wear sensor gloves that detect the position and angle of the finger articulations. Others use an RGB-D camera such as Kinect to track the hands and rely on complex algorithms to segment them from the background. Each of these approaches has its own disadvantages. Sensor-based methods are not natural, because the user must wear cumbersome instruments, while camera-based methods need additional algorithms to track and segment the hands against a complex background. To address these problems, we propose a novel SLR method based on the Real-Sense, a camera device that detects and tracks the location of the hands in a natural way. More importantly, it provides the 3D coordinates of the finger joints in real time. We build a deep neural network (DNN) on top of Real-Sense to recognize different signs. The DNN takes the 3D coordinates of the finger joints directly as input, without any handcrafted features, because a deep model is capable of learning features suitable for recognition from raw data. In our experiments, to demonstrate the effectiveness of Real-Sense, we collect two datasets, one with Real-Sense and one with Kinect, and build a DNN on each dataset for recognition. To validate the effectiveness of the DNN, we compare its performance with that of a support vector machine (SVM) on the same dataset.
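To make the input representation concrete, the following is a minimal sketch of a feed-forward network that consumes raw 3D joint coordinates, as described above. It is an illustration under stated assumptions, not the network used in this work: the number of joints, the hidden layer sizes, the number of sign classes, the ReLU and softmax activations, and all function names are hypothetical, and the weights are random because training is omitted.

```python
import math
import random

NUM_JOINTS = 22           # assumption: number of tracked joints per hand
NUM_CLASSES = 10          # assumption: number of signs to recognize
HIDDEN_SIZES = [64, 32]   # assumption: two hidden layers


def flatten_joints(joints):
    """Turn [(x, y, z), ...] into a flat vector [x0, y0, z0, x1, ...]."""
    return [coord for joint in joints for coord in joint]


def init_layer(n_in, n_out, rng):
    """Create one dense layer with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine map: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over the class scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def build_network(rng):
    """Stack dense layers: raw coordinates -> hidden layers -> classes."""
    sizes = [NUM_JOINTS * 3] + HIDDEN_SIZES + [NUM_CLASSES]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(network, joints):
    """Forward pass on raw coordinates; no handcrafted features are used."""
    x = flatten_joints(joints)
    for layer in network[:-1]:
        x = relu(dense(x, layer))
    return softmax(dense(x, network[-1]))


rng = random.Random(0)
network = build_network(rng)
# Synthetic stand-in for one frame of joint coordinates from the camera.
sample = [(rng.random(), rng.random(), rng.random())
          for _ in range(NUM_JOINTS)]
probs = predict(network, sample)
predicted_sign = probs.index(max(probs))
```

Here `probs` is a probability distribution over the hypothetical sign classes and `predicted_sign` is the index of the most likely one. With untrained weights the prediction is meaningless; the sketch only shows that the coordinates enter the network as a flat vector with no intermediate feature extraction.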