Skype's voice over IP network
=============================

Overview
--------
- services to find other users, voice and video conferencing, media sharing, etc.
- more than 600 million users, $600 million per year, accounting for more than 1/8 of all international call minutes, typically with over a hundred thousand calls active at any time
- uses JoltID's global index system to find anyone who has logged into Skype in the last 3 days. It is generally assumed that JoltID's distributed query system for handling this is based on a form of sharding.

Approach to scalability
-----------------------
- Skype maintains one central account system to track login ids and passwords, as well as a small number of default servers, but the vast majority of the system is a decentralized network run on client machine resources
- Each client machine (with some limitations discussed below) might be used as a 'supernode': a small hosting or server node within the Skype network that carries out the global index system lookups (i.e. hosts a shard) and coordinates or forwards call information for other clients.

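To make the idea of "hosting a shard" concrete, here is a minimal sketch of a sharded directory. It is not Skype's or JoltID's actual scheme, which is not described in these notes: the hash function, the names `shard_for` and `ShardedIndex`, and the fixed shard count are all assumptions chosen for illustration. Each shard stands in for the slice of the index that one supernode might hold.

```python
import hashlib
from typing import Dict, List, Optional


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user id to a shard number using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedIndex:
    """Toy directory: each dict plays the role of one supernode's shard."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[str, str]] = [{} for _ in range(num_shards)]

    def register(self, user_id: str, address: str) -> None:
        # Only the shard that owns this user id stores the entry.
        self.shards[shard_for(user_id, len(self.shards))][user_id] = address

    def lookup(self, user_id: str) -> Optional[str]:
        # A lookup consults a single shard rather than the whole index.
        return self.shards[shard_for(user_id, len(self.shards))].get(user_id)


index = ShardedIndex(num_shards=4)
index.register("alice", "10.0.0.1:40123")
print(index.lookup("alice"))
```

The point of the design is that a lookup touches only one shard, so adding more supernodes spreads both storage and query load. A real system would also have to cope with supernodes joining and leaving, which this sketch ignores.
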
Picking clients to act as supernodes
------------------------------------
- generally a client will not be chosen to host a supernode if it is behind NAT, a firewall, or a proxy server, nor will it be chosen if it is particularly bandwidth- or cpu-limited

Bypassing firewalls etc
-----------------------
- Skype tries to make placing calls as technically simple as possible for the user, and thus takes on responsibility for finding its own way past/through any firewalls, proxy servers, NAT, etc that might exist between the people taking part in a call
- because the call traffic Skype generates is precisely the kind of traffic many systems seek to block or limit, it must take a variety of steps to get past/through them
- on install, Skype picks a random port to use (hence firewalls etc can't block Skype for all users simply by blocking a single known port)
- the Skype client maintains (e.g. in the registry on Windows) a list of supernodes it knows about and has been able to access recently. On install this list is populated with a small set of servers run by Skype, but after that it is populated with supernodes running on different client machines. On startup the client goes through this list, trying to reach any one of those nodes in order to connect to the rest of the Skype network, check for updates, etc. The list is periodically rebuilt/refreshed, and if no supernodes can be accessed the client is unable to proceed.
- on startup, Skype also tries to assess what kind of obstructions there might be, and how best to bypass them, by trying a number of different connection ports and protocols until it finds something that gets through (e.g. it will try tcp and udp on its random port, on port 80 acting like http traffic, on port 443 acting like https, etc)
- if necessary, Skype tries to masquerade as a flavour of 'accepted' traffic, setting up a connection in that form with a supernode, and getting that supernode to carry out searches and connection setup with other nodes to establish a call.

Privacy and other concerns
--------------------------
- Skype claims that the added bandwidth and cpu usage is 'negligible' for chosen clients, but system administrators and other users dispute this
- mistrust of Skype is aggravated by the way the Skype application tries to mask itself in order to bypass firewalls and other 'obstructions', often causing administrators to liken Skype to a virus
- there are also privacy concerns given that call and lookup information is being stored, albeit temporarily, on other people's machines
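
The startup behaviour described under "Bypassing firewalls etc" (walk the cached supernode list, trying several protocol and port combinations until one gets through) can be sketched as a simple fallback loop. This is an illustration under stated assumptions, not Skype's implementation: the names `candidate_attempts` and `first_working`, the order of attempts, and the `probe` callback are all hypothetical, and the real client's handling of ports is more involved than a flat list of (protocol, port) pairs.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Attempt = Tuple[str, int]  # (protocol, port)


def candidate_attempts(random_port: int) -> List[Attempt]:
    """Attempts in the order listed in the notes; the order itself is an assumption."""
    return [
        ("tcp", random_port),
        ("udp", random_port),
        ("tcp", 80),   # looks like http traffic
        ("tcp", 443),  # looks like https traffic
    ]


def first_working(
    supernodes: Iterable[str],
    attempts: List[Attempt],
    probe: Callable[[str, str, int], bool],
) -> Optional[Tuple[str, str, int]]:
    """Return the first (host, protocol, port) that the probe reports as reachable."""
    for host in supernodes:
        for protocol, port in attempts:
            if probe(host, protocol, port):
                return (host, protocol, port)
    # No cached supernode reachable: per the notes, the client cannot proceed.
    return None


# Stand-in probe simulating a firewall that only lets port 443 through.
def only_https(host: str, protocol: str, port: int) -> bool:
    return protocol == "tcp" and port == 443


print(first_working(["sn-a", "sn-b"], candidate_attempts(40123), only_https))
```

Passing the probe in as a function keeps the fallback logic separate from any real networking, so the loop can be exercised without opening sockets.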